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SUMMARY 


We  consider  regularized  linear  algebraic  methods  for  the  restoration  of  large  images 
(perhaps  1000  x  1000  pixels)  derived  from  space-based  surveillance  systems.  The 
associated  computational  burdens  are  assessed.  Methods  which  accelerate  the  restoration 
process  by  exploiting  structure  within  the  imaging  matrix  are  analyzed.  It  is  shown  that,  if 
relaxation  of  the  support  of  the  reconstruction  to  that  of  the  original  image  can  be  tolerated, 
a  technique  based  on  expansion  of  the  imaging  matrix  to  circulant  form  can  be  applied 
which  is  significantly  faster  and  makes  much  smaller  demands  on  storage  resources. 
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1.  The  Discretized  Problem 


We  wish  to  form  an  estimate  of  an  object  /,  given  the  image  g  and  imaging  operator  A  . 
We  assume  additive  noise  is  present  and  write,  symbolically, 

g^Af+n  (1) 


Typically,  ^4  is  a  strongly  smoothing  operator  -  for  example,  a  convolution  in  the  case  of  a 
space-invariant  point-spread  function  -  and  noise  introduces  a  destabilizing  effect  on  any 
attempt  at  a  solution.  In  the  continuous  form  of  the  problem,  equation  (1)  becomes  a 
Fredholm  equation  of  the  first  kind,  and  a  singular  function  analysis  can  be  used^  to  identify 
the  precise  nature  of  this  difficulty.  In  the  discrete  case,  .<4  is  a  matrix,  / ,  g  and  n  are 
vectors,  and  the  singular  value  decomposition  (SVD)  of  A  can  be  used  in  a  similar  way\ 
We  shall  see  that  the  SVD  also  provides  an  elegant  technique  for  deriving  a  solution. 


In  the  presence  of  noise,  an  exact  solution  to  equation  (1)  is  impossible  to  achieve.  As  a 

means  of  choosing  an  approximation  from  the  infinite  set  |/|  of  estimates,  we  might 

require,  for  example,  that  the  Euclidean  distance  between  the  image  of  the  estimate  and  the 
given  image  data  be  a  minimum. 


Thus  we  would  minimize 


I  A  ^ 

Af-  g  and  obtain  the  normal  equations 
i  2 


where  A^  is  the  Hermitian  conjugate  of  A  ;  inversion  of  A^A  (assuming  that  it  has  an 
inverse)  then  yields  the  estimate  fest .  It  can  be  shown,  however,  that  this  estimator  is  still 
extremely  sensitive  to  noise,  small  perturbations  in  the  data  causing  large  changes  in  the 
solution.  The  problem  is  intrinsically  ill-posed.  It  can  be  converted  to  a  well-posed  one  by 
the  methods  of  regularization  theory^.  We  add  an  ‘energy’  constraint , 
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combine  it  with  the  previous  constraint,  and  minimize 


\^f-S 


2 

2 


where  /?,  the  regularization  parameter,  plays  the  role  of  a  Lagrange  multiplier.  The  solution 
becomes 

f^=(A‘'A+piy'A"g 

or 

f  p  ~  s  > 


where  A*p  =  {a^ A  +  >9/)  ^  A^  is  the  regularized  pseudoinverse  of .4,  by  analogy  with  the 
Moore-Penrose  pseudoinverse^  A^ .  The  optimum  value  for  P  will  depend  on  the  noise  level 
and  can  be  determined  in  one  of  several  ways'*’*. 

A^^  could  be  computed  directly  by  forming  A^A  and  carrying  out  the  required  matrix 

inversion.  This  procedure  loses  accuracy,  however,  and  SVD  is  the  method  of  choice^.  Let 
AhQ  znnx  n  matrix.  (The  formalism  is  easily  extended  to  the  general  mxn  case.)  We 


have 

where 

=  V”V  =  W^  = 

and 

Z  =  diag  . 

Technical  Report  No.  94-3341-01 


Page  5  of  31 


The  singular  values  (crjare  assumed  to  have  been  arranged  in  descending  order  of 
magnitude: 

crj>a-2>...><T„>0. 

Then  we  find  > 

where  =dia 

SVD  is,  however,  computationally  expensive,  involving  O(ION^)  operations  for  an  NxN 
matrix.  A  Matlab  experiment  on  a  66  MHz  80486  PC  produced  the  following  results: 


Time  to 

Output 

Number  of 

.4-Matrix 

compute  SVD 

file 

operations 

512x512 

11m  16s 

6.3  Mb 

1.33  Gflops 

The  .<4 -matrix  associated  with  a  23x23  image  would  be  of  approximately  this  size. 
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2.  Reducing  the  Computational  Burden 


In  certain  cases,  the  computational  effort  can  be  reduced  significantly.  For  example,  if  the 
point-spread  function  is  stationary  across  the  field  of  view  and  unchanging  in  time,  and  if 
the  image  size  does  not  vary,  the  SVD  can  be  precomputed  and  stored. 

The  remaining  (on-line)  computation  is 

fp  =A;g^vrpU^g 

an  0(1^)  procedure,  where  is  the  number  of  pixels  ing^. 

The  matrix-vector  multiplication  effort  can  be  further  reduced  by  exploiting  structure  in  A 
(and  hence  in  ). 

For  stationary  point-spread  function  and  equal  sampling  intervals  in  image  g  and 
reconstruction  /^,  A  is  Toeplitz  (i.e.,  A  =  [%-,])  in  the  one-dimensional  case.  In  two 

dimensions,  image  and  reconstruction  must  be  reconfigured  from  matrices  into  vectors  in 
some  way;  for  example,  by  column-wise  mapping.  A  then  becomes  block  Toeplitz  with 
Toeplitz  blocks,  consisting  of  Toeplitz  submatrices  arranged  within  in  Toeplitz  fashion. 

Figure  1  is  a  Matlab  meshplot  of  an  18  x  18  test  object  consisting  entirely  of  I's  or  O's 
located  at  the  intersections  of  the  grid  lines.  The  8x8  truncated  Gaussian  point-spread 
function  is  shown  in  Figure  2.  The  blurred  image  of  Figure  3  obtained  with  this  point- 
spread  function  then  occupies  25  x  25  pixels;  note  that  no  noise  has  been  added  to  this 
image.  The  625  x  324  imaging  matrix  A,  which  relates  the  column-mapped  object  and 
image  vectors,  is  shown  in  Figure  4  in  perspective  and  in  contour  form  in  Figure  5.  Its 
Toeplitz  symmetries  are  clearly  discernible. 
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Figure  1 
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Figure  2 


8x8  Gaussian  point-spread  function 
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Figure  3 
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Figure  4 
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Figure  6 


deconvolution  matrix  obtained  from  svd  of  A 
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deconuo 1 ut ion  matrix  obtained  from  svd  of  A 
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reconstruction  of 
18x18  object  uia  sud 
unregular i zed 


3.  Exploiting  Symmetries 


Symmetries  in  ,  deriving  from  the  Toeplitz  properties  of  A,  are  clearly  evident  in 
Figure  6  and  7.  Such  Toeplitz  structure  can  be  exploited  using  the  concept  of  displacement 
rank’.  The  displacement  rank  of  a  general  matrix  B  is  the  smallest  number  a  for  which  we 
can  write 


B^iAx,)u{y,)  . 

k=l 

where  L  and  U  are  lower  and  upper  and  triangular  Toeplitz  matrices.  The  and  y*  can 
be  obtmned  by  SVD  of  the  'displaced  difference'  B-ZBZ^  of  the  matrix,  where  Z  is  the 
downshift  matrix 


0  0  0  ...0  0 

1  0  0  ...0  0 

0  1  0  ...0  0 

_0  0  0  ...1  0  . 

For  ,  we  expand  the  SVD  of  the  displaced  difference  in  the  following  form*: 

a;  -zA;z^=is,p,q^, 

where  the  and  are  the  left  and  right  orthonormal  singular  vectors  and  the  singular 
values  are  indexed  in  decreasing  order.  This  expression  leads  to  the  following 
displacement-rank  expansion  for  A^  : 
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Here  L(p^)  denotes  a  lower  Toeplitz  matrix  with  first  column  ,  and  {q^)  an  upper 
Toeplitz  matrix  with  first  row  q^. 

This  expansion  enables  the  matrix-vector  product  A}g  to  be  performed  by  FFT  methods, 
thus  conferring  a  potential  gain  in  speed  and  storage  requirements.  However,  an  expensive 
SVD  is  required  to  find  the  p^.  and  and  in  practice  a  may  not  be  an  especially  small 
number. 


We  can  achieve  much  greater  computational  efficiency,  and  also  develop  a  procedure  with 
wide  applicability,  by  keeping  the  pixel  sizes  in  the  image  and  reconstruction  spaces  equal 
and  by  allowing  the  support  of  the  reconstruction  to  expand  to  that  of  the  image. 


A  can  then  be  made  a  circulant^  in  which  the  rows  and  columns  are  cyclical,  and  in  which 
any  one  column  (or  row)  contains  all  the  information  needed  to  construct  the  complete 

matrix  yl.  The  elements  of  the  circulant  of  order  n  are  related  by  the  equation  A  =  [ay-,+i]  , 

where  the  subscript  is  taken  modulo  n.  Its  key  property  for  the  present  purpose  is  that  the 
Fourier  transform  diagonalizes  a  circulant.  We  can  write 


where 


A  =  F^KF 


1 

1 

1 

... 

1 

1 

w 

•  •• 

w"-‘ 

1 

... 

^2(n-l) 

,  w  =  exp(/2;r  /  n) 

1 

... 
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and  K  =  diag[\,....,X^  . 


Then  from 


/,  +  /?/)- Vg 


we  find 


fp  =F^k\Fg 


where 


PCp  =dia^ 


JL* 


>1 

J 


We  now  need  only  1-D  FFTs.  The  X  ^  are  given  by  the  FFT  of  the  first  column  of  A  and 
the  reconstructed  image  is  computed  as 


ffi 


which  is  an  0(3n)  logj  n  procedure. 

The  penalty  is  that  the  support  of  has  been  relaxed  -  noise  can  leak  into  'zero'  areas.  For 

images  much  larger  than  the  effective  point-spread  function,  however,  this  is  not  a  serious 
problem. 

Figures  9  and  10  show  the  circulant  form  of  the  imaging  matrix  for  the  same  point-spread 
function  and  image  size  as  in  the  earlier  example,  and  should  be  compared  -with  Figures  4 
and  5.  The  reconstruction  now  expands  to  25  x  25  pixels.  It  can  easily  be  shown  that  the 
deconvolution  matrix  ,  displayed  in  Figures  1 1  and  12,  is  also  a  circulant,  although  much 

less  sparse.  In  the  noiseless  case,  the  reconstruction  shown  in  Figure  13  is  obtained  with 
>9=0.  When  Gaussian  noise  is  added  to  the  image  with  a  standard  deviation  of  lO"^ 
relative  to  the  local  pixel  value,  the  reconstruction  is  as  shown  in  Figure  14.  At  a  noise 
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level  of  10'^,  the  object  is  no  longer  recognizable;  see  Figure  15.  By  setting  the 
regularization  parameter  equal  to  10’'®,  however,  the  image  is  acceptably  restored,  as  can  be 
seen  in  Figure  16.  Note  also  that  the  leakage  of  energy  outside  the  true  object  support  is 
now  very  limited. 
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Figure  9 
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Figure  13 
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Figure  14 
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Figure  16 
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For  a  space-variant  point-spread  function,  direct  expansion  of  A  to  circulant  form  is  no 
longer  possible.  However,  if  variation  across  the  image  is  small,  a  displacement-rank 
expansion  of  into  sums  of  products  of  upper  triangular  Toeplitz  and  circulant  matrices 

can  be  made^®; 


The  and  are  now  obtained  by  SVD  of  the  cyclic  displacement  of  A^p : 

a^p-ea;e^ 

where  E  is  the  cyclic  downshift  matrix 


00  0  ...  0  1 

10  0  ...  0  0 

0  1  0  ...  0  0 

0  0  0  ...  1  0. 


Once  again,  however,  the  computationally  intensive  SVD  is  required.  In  addition,  a  is  not 
necessarily  a  small  number. 
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4.  Conclusions 


For  space-invariant  point-spread  functions  and  equal  sampling  in  image  and  reconstruction 
spaces,  expansion  of  the  imaging  matrix  A  to  circulant  form  is  straight-forward,  providing 
immediate  access  to  one-dimensional  FFT  construction  techniques  and  greatly  enhanced 
efficiency  through  reduced  computational  effort  and  storage  requirements.  The  matrix 
representing  the  image  is  vectorized  by  a  simple  column-wise  (or  row-wise)  mapping. 

The  penalty  of  a  circulant  expansion  is  loss  of  a  smaller  support  in  reconstruction  space. 
However,  for  typical  images,  using,  for  example,  a  CCD  array,  where  the  point-spread 
fiinction  is  generally  of  much  smaller  extent  than  the  image,  this  is  not  a  serious  limitation. 

If  the  point-spread  function  is  not  space-invariant,  or  if  sampling  rates  differ  in  image  and 
reconstruction,  the  SVD  is  the  preferred  means  for  synthesis  of  the  deconvolution  matrix. 
For  this  case,  faster  SVD  procedures  would  be  a  valuable  development.  If  the  accuracy  and 
reliability  afforded  by  the  SVD  are  not  of  paramount  importance,  fast  approximate  methods, 
based,  for  example,  on  a  separable  approximation  to  the  point-spread  function,  may  be 
adequate. 
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Superresolution  Algorithms  for  a  Modified  Hopfield 

Neural  Network 

John  B.  Abbiss,  Bryan  J.  Brames,  and  M.  A.  Fiddy,  Member,  IEEE 


Abstract— purpose  of  this  paper  is  to  describe  the  imple¬ 
mentation  of  a  superresolution  (or  spectral  extrapolation)  pro¬ 
cedure  on  a  neural  network,  based  on  the  Hopfield  model.  This 
was  first  proposed  by  Abbiss  et  aL  [1],  We  show  the  computa¬ 
tional  advantages  and  disadvantages  of  such  an  approach  for 
different  coding  schemes  and  for  networks  consisting  of  very 
simple  two  state  elements  as  well  as  those  made  up  of  more 
complex  nodes  capable  of  representing  a  continuum.  With  the 
appropriate  hardware,  we  show  that  there  is  a  computational 
advantage  in  using  the  Hopfield  architecture  over  some  alter¬ 
native  methods  for  computing  the  same  solution.  We  also  dis¬ 
cuss  the  relationship  between  a  particular  mode  of  operation  of 
the  neural  network  and  the  regularized  Gerchberg-Papoulis 
algorithm. 

1.  Introduction 

HERE  are  several  models  of  neural  networks,  each  of 
which  has  a  structure  based  loosely  on  our  view  of 
biological  nervous  system  components  [2] .  A  neural  net¬ 
work  architecture  is  one  consisting  of  a  very  large  number 
of  simple  processing  elements  densely  interconnected  by 
a  set  of  weighted  links.  Each  processing  element  updates 
its  state  by  comparing  the  sum  of  its  inputs  with  a  pre¬ 
scribed  threshold.  The  study  of  the  properties  of  neural 
networks  is  a  subject  still  somewhat  in  its  infancy,  and 
current  hardware  limitations  reduce  their  practical  im¬ 
pact.  Indeed,  it  has  been  suggested  by  Anderson  and  Ro- 
senfeld  [3]  that  they  may  not  become  useful  until  inex¬ 
pensive  special  purpose  parallel  hardware  is  available. 
Should  that  hardware  be  available,  the  question  remains 
as  to  how  one  would  make  best  use  of  a  neural  computer; 
i.e.,  how  one  should  program  or  “train”  it  to  perform  the 
tasks  required.  The  hope  is  that  some  problems  for  which 
it  is  difficult  to  find  satisfactory  algorithmic  solutions 
might  be  amenable  to  this  kind  of  computing  architecture, 
which  can  organize  itself  and  learn  what  it  is  expected  to 
accomplish. 

One  anticipated  use  of  neural  networks  is  in  autoasso- 
ciative  memory  and  in  image  (or  signal)  classification, 
recognition  or  understanding  applications;  these  are  ap¬ 
plications  that  we  believe  the  human  brain  is  particularly 
good  at  while  current  algorithms  implemented  largely  on 
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serial  machines  still  leave  much  to  be  desired.  For  most 
neural  networks,  their  learning  or  restoration  capabilities 
can  be  expressed  in  terms  of  the  minimization  of  some 
appropriate  energy  or  cost  function.  One  of  the  objectives 
of  this  paper  is  to  take  an  established  algorithm  in  image 
reconstruction  and  identify  those  aspects  of  it  that  can  be 
related  to  the  programming  requirements  that  would  be 
necessary  for  implementation  on  a  Hopfield  neural  com¬ 
puter.  From  the  analysis  of  such  a  network  applied  to  solve 
a  problem  for  which  the  cost  function  is  well  defined,  one 
might  be  able  to  assess  their  use  for  the  solution  of  a  wider 
class  of  optimization  problems. 

The  Hopfield  network  is  a  fully  connected  network  in 
the  sense  that  any  one  of  the  processing  elements  is  con¬ 
nected  to  every  other  one.  This  contrasts  with  layered  net¬ 
works,  such  as  a  multilayered  perceptron  (MLP),  in  which 
processing  elements  are  arranged  with  connections  only 
between  neighboring  layers.  This  difference  in  topology 
is  accompanied  by  differences  in  the  thresholding  func¬ 
tions  and  in  the  procedures  to  find  the  connection 
strengths.  The  Hopfield  network  operates  iteratively;  the 
connection  strengths  are  assigned  and  specify  a  cost  func¬ 
tion  which  the  iterative  procedure  minimizes.  The  MLP 
is  a  one-pass  network  once  the  connection  strengths  have 
been  “learned”  by  the  minimization  of  an  error  function 
which  quantifies  the  difference  between  the  current  and 
desired  output  states. 

It  was  pointed  out  by  Jau  et  al.  [4]  that  some  iterating 
image  restoration  processes  are  mathematically  very  sim¬ 
ilar  to  autoassociative  memory;  indeed  if  the  input  infor¬ 
mation  is  incomplete,  it  can  be  considered  as  a  key  pattern 
to  an  associative  memory.  Since  the  approach  to  image 
restoration  presented  here  was  first  proposed  [1],  there 
have  been  other  related  studies  which  we  mention  here. 
Zhou  et  al.  [5]  considered  an  energy  function  identical  to 
(6)  in  order  to  specify  network  interconnection  strengths. 
Their  application  was  the  restoration  of  grey  level  images 
degraded  by  a  shift  invariant  FIR  blur  function  and  addi¬ 
tive  noise.  Grey  level  information  was  coded  by  a  simple 
sum  of  binary  elements  and  the  network  was  serially  up¬ 
dated  with  a  stochastic  thresholding  rule  to  avoid  getting 
trapped  in  local  minima  of  the  energy  function.  Jang  et 
al  [6]  utilized  the  optimization  properties  of  the  Hopfield 
network  in  order  to  estimate  a  matrix  inverse.  This  is 
clearly  important  in  image  restoration  problems,  as  indi¬ 
cated  by  (8).  Full  grey  level  representation  is  assumed  and 
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a  differential  mode  of  implementation  applied;  they  point 
out  that  this  method  is  similar  to  a  steepest  descent  method 
but  with  a  nonlinear  thresholding  step  at  each  iteration. 
Bai  and  Farhat  [7]  also  considered  a  cost  function  similar 
to  that  in  (6)  to  recover  images  from  limited  Fourier  data. 
In  their  approach  there  was  an  additional  constraint  on  the 
norm  of  the  derivative  of  the  estimate  as  well  as  on  the 
norm  of  the  estimate  itself.  The  increment  added  to  the 
current  estimate  was  weighted  by  a  “gain”  factor  prior 
to  thresholding,  which  was  chosen  to  ensure  that  the  net¬ 
work  energy  function  decreased  at  each  step.  Their  recon¬ 
structions  showed  advantages  for  low-signal-to-noise  ra¬ 
tio  situations  and  are  being  implemented  on  optoelec¬ 
tronic  hardware.  Winters  [8]  considers  a  norm  minimi¬ 
zation  as  expressed  by  (6)  but  without  any  explicit  regu¬ 
larization  term  included.  The  minimization  of  his  cost 
function  is  achieved  by  the  penalty  method  which  requires 
that  a  large  positive  value  is  added  to  the  cost  function, 
wherever  a  nonlinear  inequality  constraint  is  not  satisfied. 
An  adaptive  penalty  function  allows  one  to  avoid  local 
minima  and  this  complete  procedure  can  be  mapped  onto 
the  Hopfield  energy  function.  Results  showed  that  the  re¬ 
constructions  were  robust  against  noise  and  could  be  im¬ 
plemented  in  microseconds  on  an  analog  electronic  net¬ 
work,  as  compared  to  several  hours  on  a  minicomputer. 

Using  a  Hopfield  network,  our  interest  is  in  an  appli¬ 
cation  for  which  a  solution  state  evolves  through  the  min¬ 
imization  of  some  specific  cost  or  energy  function.  Once 
the  energy  function  is  defined,  one  can  determine  the  ap¬ 
propriate  connection  strengths  in  order  that  the  energy 
function  associated  with  the  network  is  the  same  as  that 
of  the  problem  under  consideration.  A  key  feature  of  a  net 
of  this  type  is  its  construction  from  a  set  of  simple  pro¬ 
cessors,  each  of  whose  states  is  determined  by  a  thresh¬ 
olding  operation  applied  to  a  sum  of  weighted  inputs  from 
other  processors  or  nodes.  The  properties  of  the  network 
as  a  whole  are  determined  by  the  thresholding  function 
used,  and  by  the  pattern  and  strengths  of  the  connections 
between  the  processing  elements. 


II.  The  Hopfield  Network  with  Two  State 
Elements:  Theoretical  Framework 

The  Hopfield  neural  model  [9],  [10]  allows  one  to  spec¬ 
ify  a  set  of  desired  memories  as  minima  of  a  configura¬ 
tional  energy  of  the  network.  We  assume  that  the  network 
consists  of  N  processing  elements  each  of  which  has  two 
states  and  each  of  which  has  a  thresholding  operator  that 
determines  the  states  of  the  element  from  the  total  input 
to  that  element. 

Given  an  initial  starting  configuration  or  state  of  the 
network,  each  processor  or  “neuron”  updates  its  state  ac¬ 
cording  to  a  threshold  rule  of  the  form 

N 

if  S  TiiVj  >  0  then  1;  otherwise  =  -I  (1) 

j=] 


where  the  elements  of  the  connection  matrix  T  are  formed 
according  to  the  rule 

M 

Ty=Zvfv^  HJ  =  \,2  ■  ■  ■  N)  (2) 

1 

and  the  vf  are  the  elements  of  the  M  memory  vectors,  v^, 
to  be  stored.  The  can  take  the  values  +1,  and  it  is 
assumed  that  the  diagonal  term,  r„,  is  zero  in  the  Hopfield 
model. 

This  thresholding  rule  can  be  applied  in  series  (asyn¬ 
chronously)  or  in  parallel  (synchronously).  In  the  serial 
mode,  the  rule  is  applied  sequentially  to  the  nodes  of  the 
network,  the  state  of  the  network  being  updated  after  each 
operation.  In  the  parallel  mode,  the  current  network  state 
remains  unchanged  until  the  thresholding  operation  has 
been  applied  to  every  node.  The  configurational  energy 
function  for  the  network  has  the  form  [9],  [10] 

N  N 

E  =  TijV,Vj=  -\v^Tv  (3) 

/•  =  1  7  =  I 

where  superscript  T  denotes  transpose.  Serial  threshold¬ 
ing  will  always  minimize  this  energy  function,  provided 
that  the  Ta  are  nonnegative.  If  M  is  sufficiently  small,  this 
state  will  correspond  to  the  memory  closest  in  Hamming 
distance  to  the  state  in  which  the  network  was  started. 
Parallel  thresholding  results  in  either  convergence  to  a 
stable  state  or  oscillation  between  two  states  [11]. 

This  iterative  scheme  can  be  expressed  more  concisely 
and  modified  to  permit  biasing  of  the  neuron  inputs  in  the 
following  manner.  We  let  the  state  of  the  network  after 
the  nth  iteration  be  described  by  the  N  element  vector 

^(«  +  i)  =  U(Tv^'<^  +  b) 

where  JJ  is  the  threshold  operation,  T  denotes  the  matrix 
with  elements  7]y,  a  superscript  denotes  iteration  number, 
and  b  is  the  bias  vector.  The  bias  vector  incorporates 
boundary  conditions  such  as  image  data;  it  effectively 
shifts  the  decision  threshold  for  each  element.  In  this  case 
the  energy  function  minimized  by  the  network  is  of  the 
form  [10] 

E  =  -Xjlv'^Tv  -  b^v.  (4) 

The  change  in  energy  for  a  change  in  the  state  of  one 
neural  element  from  to  f ^  -k  A  is 

=  -^Vk[(Tv  +  b)k  +  j  TkkAvi,]. 

Taking  to  be  zero  ensures  that  the  change  in  energy  is 
always  negative,  since  the  term  in  the  brackets  above  then 
has  the  same  sign  as  At'^..  If  is  nonzero,  the  term  in 
the  bracket  will  have  the  same  sign  as  Aj.  provided  T).^.  is 
positive,  and  then  E  is  guaranteed  to  reduce;  the  conse¬ 
quences  of  varying  were  explored  in  an  earlier  publi¬ 
cation  and  verify  the  expected  behavior  [12].  If  two  (or 
more)  neurons  change  state  simultaneously,  the  change  in 
E  contains  terms  involving  products  of  the  form  —1/2 
7).,  A  Vi,Avi  (or  these  plus  higher  order  terms  if  more  neu¬ 
rons  change),  the  sign  of  which  can  vary. 
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The  two  state  representation  is  too  limited  for  a  one-to- 
one  mapping  between  elements  and  signal  or  image  sam¬ 
ples,  in  most  cases.  However,  these  simple  elements  can 
be  taken  in  groups  to  represent  grey  levels  through  a  va¬ 
riety  of  coding  schemes.  Alternatively,  either  analog  or 
more  complex  digital  processing  elements  could  be  used 
to  directly  represent  a  grey  level. 

III.  The  Superresolution  Problem 
There  are  many  applications  that  require  the  restoration 
of  a  signal  or  image  from  a  limited  discrete  data  set;  for 
example,  samples  of  the  spectrum  of  a  function,  or  of  its 
low  or  band-pass  filtered  image.  An  important  a  priori 
assumption  for  work  in  super  resolution  is  the  fact  that 
most  objects  to  be  imaged  are  of  compact  support.  This 
leads  to  the  well-known  result  that  their  spectra  are  band- 
limited  functions.  In  principle,  therefore,  one  might  hope 
to  extend  limited  spectral  data  by  means  of  analytic  con¬ 
tinuation.  This  procedure  is  notoriously  unstable  in  the 
presence  of  noise  and  does  not  provide  a  practical  solution 
to  the  problem.  One  has  infinite  freedom  in  interpolating 
and  extrapolating  limited  sampled  data;  hence,  one  is 
forced  to  approach  super  resolution  from  an  optimization 
point  of  view  [13].  The  best  that  one  can  hope  to  achieve 
is  the  specification  of  a  cost  or  energy  function  which  pos¬ 
sesses  a  unique  minimum  and  is  designed  to  incorporate 
whatever  constraints  and  a  priori  knowledge  might  be 
available  to  help  limit  the  set  of  possible  solutions  to  the 
problem,  while  retaining  desirable  and  necessary  solution 
characteristics.  Examples  of  constraints  include  data  con¬ 
sistency,  support  consistency  and,  perhaps,  positivity. 
The  objective  of  the  superresolution  process  is  to  obtain 
a  final  image  that  has  a  higher  spectral  or  spatial  fre¬ 
quency  content  than  the  original  data  set,  as  a  direct  con¬ 
sequence  of  incorporating  the  prior  knowledge  available 
into  the  cost  function.  It  is  a  matter  of  taste,  to  a  large 
extent,  how  one  designs  a  cost  function  in  order  to  obtain 
a  desirable  solution  to  the  problem;  i.e.,  a  superresolved 
signal  or  image  with  acceptable  properties.  The  super¬ 
resolution  problem  is  thus  transformed  into  one  of  deter¬ 
mining  the  (global)  extremum  of  a  cost  function  on  the 
assumption  that  this  solution  is  optimum. 

One  of  the  early  successes  of  a  neural  network  was  to 
find  a  good  approximation  to  the  traveling  salesman  op¬ 
timization  problem  [14].  The  superresolution  optimiza¬ 
tion  problem  can  be  mapped  onto  a  neural  network  in  two 
distinct  ways.  One  is  to  train  network  using  a  data  base 
of  superresolved  images  [15]-[17],  the  other  is  to  relate 
the  cost  function  associated  with  a  given  network  to  the 
chosen  superresolution  cost  function.  It  is  the  latter  ap¬ 
proach  that  we  adopt  here. 

IV.  Superresolution  and  Spectral  Estimation 

Most  signal  or  image  recovery  problems  can  be  de¬ 
scribed  by  linear  equations  of  the  form 

g(x)  =  I  A{x,  y)fiy)  dy 


where  A  is  the  system  spread  function  or  the  Fourier  trans¬ 
form  kernel,  for  example.  The  interpretation  of  the  data 
g(x)  to  obtain  information  about  the  object /(y)  requires 
the  solution  of  a  linear  inverse  problem.  This  is  equivalent 
to  finding  the  solution  of  a  Fredholm  integral  equation  of 
the  first  kind.  It  is  well  known  that  small  fluctuations  in 
the  data  g{x)  can  lead  to  very  large  fluctuations  in  the  es¬ 
timate  of  the  unknown  function /(y).  This  is  a  manifes¬ 
tation  of  the  ill-posed  nature  of  the  problem  (the  inverse 
of  the  operator  A  is  not  generally  continuous)  and  some 
degree  of  regularization  is  required  in  order  to  determine 
stable  and  meaningful  solutions.  One  usually  proceeds  by 
assuming  that  the  desired  solution  belongs  to  the  space  F 
of  (possibly  weighted)  functions,  the  regularization  re¬ 
stricting  that  solution  to  conform  to  any  a  priori  knowl¬ 
edge  available  about  the  object  whose  enhanced  image  is 
sought. 

In  practice,  the  solution  is  determined  from  a  finite  set 
of  samples  of  g(jt),  and  the  data  vector  g  is  expressed  by 

g  =  Af  +  n 

where  A  is  the  imaging  operator  and  n  represents  an  ad¬ 
ditive  noise  component;  A  explicitly  contains  the  support 
constraint  of  /,  which  is  assumed  to  be  known  or  esti¬ 
mated  a  priori.  These  limited  data  can  be  regarded  as  a 
noisy  finite  set  of  bounded  linear  functionals  off. 

A  data-consistent  solution  exists,  however,  which  is  a 
solution  of  minimum  norm.  This  solution  is  the  data-con¬ 
sistent  ^  which  minimizes  ll  g|l^,  where  1|  1|  denotes  norm. 

The  solution  to  this  minimization  problem  can  be  writ¬ 
ten 

N 

J  =  S  (g,  Vi)Ui/ai  (5) 

i  =  1 

where  N  is  the  number  of  image  data  points  and  the  «/,  w, 
and  Vi  are  the  singular  values,  singular  functions,  and  sin¬ 
gular  vectors,  respectively,  pertaining  to  the  operator  A: 
Aui  =  aiVi\  A^Vi  =  Qi/W/.  The  singular  values  tend  to  zero 
as  i  increases,  leading  to  an  instability  in  the  estimator.  If 
the  first  Ni  <  N  singular  values  are  dominant,  then  the 
remainder  may  be  neglected,  but  only  at  the  expense  of 
loss  of  resolution  in  jg. 

Thus,  this  solution  is  ill-conditioned  but  stability  can 
be  restored  by  relaxing  data  consistency;  hence,  we  min¬ 
imize  the  cost  function 

E  =  Ul,'  -  gf  +  (6) 

The  estimate  is 

N 

g  =  S  (g,  Vi)Uiai/(aj  -E  /3)  (7) 

i  =  1 

where  the  regularization  parameter  /3  is  chosen  to  achieve 
a  compromise  between  resolution  and  stability,  and  usu¬ 
ally  requires  some  adjustment  in  order  to  establish  its  op¬ 
timal  value.  As  |3  tends  to  zero,  the  solution  becomes  more 
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data  consistent.  The  minimizer  of  this  cost  function  can 
be  computed  directly  in  matrix  form,  namely, 

g  =  [A*  A  +  0ir'A*g  .  (8) 

where,  for  a  real- valued  matrix,  yl*  becomes  A^,  and  I 
represents  the  identity  matrix. 

An  alternative  approach  to  estimating  the  object  is  to 
consider  the  minimization  of  the  cost  function  ||/  -  |p 
using  a  trigonometric  polynomial  of  the  form  [13] 

N 

€  =  ,2 

k  ~  1 

where  the  0^^  form  a  basis  in  the  data  space  F  and  the 
optimal  dj^  satisfy 

2  4>n)F  +  =  G„  (9) 

m  =  1 

and  the  are  Fourier  data  corresponding  to  the  low-pass 
filtered  image  g. 

It  is  worth  pointing  out  that  in  the  space  F  that  incor¬ 
porates  the  known  support  constraint  for  the  function  to 
be  restored,  the  three  solutions  given  by  (7)-(9)  are  equiv¬ 
alent;  expression  (9)  can  be  obtained  from  expression  (8) 
[18].  Each  method  for  solution  is  more  or  less  computa¬ 
tionally  the  same  in  that  each  requires  —  0(N^)  multipli¬ 
cations;  this  was  pointed  out  earlier  in  [1]. 

We  note  that  expression  (7)  requires  on  the  order  of 
CN^  multiplications,  where  the  overhead  C  is  large  by 
comparison  with  the  other  methods.  However,  a  primary 
concern  is  the  ease  with  which  the  regularization  param¬ 
eter  /3  can  be  varied;  this  can  be  done  at  the  cost  of 
multiplications  for  (7). 

V.  Implementation  on  a  Neural  Network 

We  will  now  show  how  a  superresolution  algorithm 
equivalent  to  the  previously  described  approaches  can  be 
defined  on  a  Hopfield  neural  network.  Several  issues  must 
be  addressed.  First,  it  is  necessary  to  define  the  connec¬ 
tion  matrix  from  the  cost  function.  For  some  problems 
this  cannot  be  accomplished  without  performing  more 
calculations  than  are  required  for  a  more  conventional  so¬ 
lution  to  the  problem.  The  latter  consideration  was  noted 
by  Takeda  and  Goodman  [19].  Thus,  the  computational 
load  or  complexity  must  be  considered  in  deciding  the 
merits  of  a  neural  network  solution  to  this  problem. 

Other  issues  which  must  be  addressed  center  on  modi¬ 
fications  of  Hopfield ’s  formulation  to  satisfy  our  require¬ 
ments.  Hopfield  ensures  that  the  network  will  converge 
by  arbitrarily  setting  the  diagonal  of  the  connection  matrix 
to  zero.  This  is  unacceptable,  because  it  shifts  the  abso¬ 
lute  energy  minimum  of  the  network  from  the  minimum 
of  (6).  Moreover,  while  a  suitable  network  can  be  con¬ 
structed  from  two-state  elements,  one  often  requires  that 
the  reconstruction  be  represented  over  at  least  a  set  of  grey 
levels.  Thus,  it  may  be  necessary  to  combine  a  number  of 
neural  elements  to  represent  a  reconstruction  pixel.  The 
method  of  coding  the  grey  levels  depends  on  the  com¬ 


plexity  of  the  elements,  i.e.,  whether  they  can  only  take 
on  two  states,  a  bounded  continuum  of  values  such  as  0 
<  <  1  (“graded  neurons”),  or  an  (elfectively)  un¬ 

bounded  continuum.  Granularity  of  the  representation  will 
affect  the  convergence  properties  of  the  network;  coarsely 
quantized  systems  can  converge  to  local  energy  minima, 
yielding  less  than  optimal  reconstructions. 

We  shall  first  demonstrate  the  formal  mapping  of  a 
superresolution  algorithm  onto  a  Hopfield  network.  The 
technique  will  then  be  extended  to  fully  address  the  sec¬ 
ond  and  third  issues  on  a  two-state  network.  We  will 
briefly  examine  the  advantages  of  more  finely  quantized 
systems,  and  finally  discuss  the  relation  between  parallel 
thresholding  and  a  regularized  form  of  the  well-known 
Gerchberg-Papoulis  spectral  extrapolation  algorithm 
[20]-[22]. 

Let  us  represent  the  current  estimate  by  the  state  of  the 
network  v.  We  can  rewrite  (6)  as 

E  =  v^A^Av  —  Iv^A^g  +  g^g  -f  jSv^v. 

Comparing  this  expression  for  E  with  that  of  the  Hopfield 
network,  (4),  gives 

T  =  -2{A^A  +  /3/)^ 

b  =  lA^g  J  (10) 

where  /  is  the  identity  matrix,  and  the  g^g  term  can  be 
ignored,  since  it  represents  a  total  offset  for  E. 

Thus,  superresolution  can  be  mapped  simply  and  di¬ 
rectly  onto  a  Hopfield  network.  The  connection  matrix  is 
formed  from  the  imaging  operator  matrix,  which  contrib¬ 
utes  information  about  the  imaging  system,  and  the 
regularization  parameter  /3,  which  sets  a  bound  on  the 
norm  of  the  final  estimate.  The  available  data  g  contribute 
only  to  the  bias  vector  b. 

For  serial  operation,  the  change  in  energy  due  to  a 
change  A  Vj,  is 

=  -Av,[{Tv  +  b\A\  (11) 

Convergence  to  a  minimum  is  guaranteed  if  the  expres¬ 
sion  in  braces  always  has  the  same  sign  as  A  v^.  This  can 
be  ensured  by  altering  the  diagonal  of  T to  zero;  however, 
such  a  change  is  equivalent  to  choosing  an  arbitrary  value 
for  0.  This  is  not  acceptable,. since  the  regularization  pa¬ 
rameter  should  be  chosen  to  reflect  the  noise  in  the  data 
and  to  obtain  an  optimum  reconstruction,  not  to  ensure 
convergence  of  the  algorithm. 

VI.  Superresolution  on  a  Two-State  Network 

In  this  section  we  will  modify  the  Hopfield  formulation 
so  that  the  energy  minimum  of  the  network  will  coincide 
with  that  of  (6),  while  still  decreasing  the  energy  with 
each  change  in  the  state  vector.  We  shall  find  that  this  is 
possible  by  introducing  a  two-level  threshold  in  place  of 
the  usual  single-level  one.  In  addition,  we  will  incorpo¬ 
rate  a  generalized  grey  scale  mapping  which  describes  lin¬ 
ear  transformations  of  a  state  vector  v  into  a  vector  w  hav¬ 
ing  grey  levels.  We  write  this  as  w  —  Sv,  where  5  could 
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be  a  mapping  from  an  N  element  vector,  each  of  whose 
elements  can  take  the  values  0  or  1 ,  to  an  L-element  vec¬ 
tor  (L  <  A^)  whose  elements  can  take  a  wider  range  of 
values.  For  example,  if  S  represents  a  base-2  mapping, 
each  element  of  v  can  represent  a  power  of  2,  giving  a 
range  of  2^/^  values  for  each  element  of  w.  The  range  of 
V  need  not  be  limited  to  {0,  1}:  we  use  this  for  the  pur¬ 
pose  of  illustration.  Other  coding  schemes  are  possible, 
such  as  clustering  or  bit-density  coding. 

The  expression  for  the  energy  is  now 

E  =  \\Aw  -  gf  +  /3||w|p. 

Suppose  the  grey-level  vector  w  is  perturbed  by  some 
amount  Aw.  The  difference  in  energy  between  states  is 

AE  =  2Aw'^{(A^A  +  ^I)w  -  A^g  +  ^  (A^A  +  /3/)Aw} 

(12) 

or,  in  terms  of  the  neural  state  vector  v, 

AE  =  2Av^[S\a'^A  +  ^I)Sv  -  S^A^g 

+  ^S\A^A  +  ^I)SAv].  (13) 

It  should  be  emphasized  that  (13)  contains  no  assump¬ 
tion  about  the  range  of  values  of  v  or  of  the  updating  mode 
of  the  network.  A  restriction  to  two  states  reflects  the  de¬ 
sire  to  use  a  large  number  of  simple  binary  processing 
elements  in  neural  architectures. 

We  now  present  a  procedure  which  ensures  that  the 
change  in  energy  expressed  by  (12)  and  (13)  always  de¬ 
creases,  provided  serial  thresholding  is  adopted.  For  a 
change  Avj^,  the  change  in  energy  AEf^  of  the  network  is 
given  by  (11),  with  the  following  definitions  for  Tand  b: 

T  =  -2S\a'^A  +  /3/)5 

b  =  2S’^A^g. 

The  grey-scale  mappings  we  are  considering  associate 
a  specific  neuron  with  one  and  only  one  image  pixel. 
Hence,  the  columns  of  S  each  contain  only  one  element, 
and  it  is  not  difficult  to  show  that  the  diagonal  elements 
of  T  take  the  form 

Ttt  =  -2Sj,(A^A  +  ^Djj  (14) 

where  Sj^.  is  the  nonzero  element  of  the  ^th  column  of  S. 
Since  the  diagonal  elements  of  A^A  are  positive,  and  13  is 
some  positive  quantity,  is  always  negative.  Hence,  we 
can  rewrite  (13)  in  the  form 

AE,  =  -Av,{(Tv  +  fc)*  -  i  \Ta\Av,}.  (15) 

Thus,  AE,  will  be  negative  provided  (At'*.)^  <  A,Av,, 
where 

=  Tf“r  (Tv  +  b), 

the  maximum  decrease  in  energy  occurring  when 

Av,  =  ^A,.  (16) 
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Thus,  we  require  that  |  Ai^*|  <  |A*|  and  sgn  (Ai^^^)  =  sgn 
(Ai).  For  a  binary  network,  where  v,  e  (0,  1},  we  then 
obtain  the  following  rule  to  ensure  that  the  network  en¬ 
ergy  does  not  increase: 

FI  for  A*  >  1 

=  vP  for  I  A, I  <  1 

Co  for  <  “i- 

We  consider  next  the  operation  of  a  nonbinary  network. 

VII,  Superresolution  on  a  Nonbinary  Network 

The  restriction  of  the  state  vector  v  to  binary  values 
permitted  the  simplest  possible  processing  elements  to  be 
used  in  a  neural  architecture.  With  more  complex  proces¬ 
sors  this  simple  representation  is  unnecessary  and  ineffi¬ 
cient:  the  optimal  coding  scheme  is  intimately  related  to 
the  nature  of  the  available  hardware. 

A  disadvantage  of  simple  two-level  elements  is  that  they 
can  give  a  coarsely  quantized  representation  in  recon¬ 
struction  space  which  leads  to  the  creation  of  local  energy 
minima.  There  is  still  only  one  absolute  energy  minimum, 
but  the  network  may  converge  to  a  local  minimum  of 
higher  energy.  It  should  be  recognized  that  this  behavior 
also  occurs  with  a  single  level  threshold  and  a  zero- 
diagonal  T  matrix,  and  the  resulting  reconstructions  are 
sometimes  called  “spurious  stored  states.”  A  typical  so¬ 
lution  of  this  type  is  shown  in  Fig.  1(c)  for  a  network  of 
90  two-level  elements;  a  6-b  coding  scheme  yields  15 
points  in  the  reconstruction.  Whether  such  a  reconstruc¬ 
tion  is  of  acceptable  quality  is  difficult  to  predict,  and  a 
function  of  the  needs  of  the  user.  This  difficulty  can  be 
overcome  by  using  elements  which  can  take  on  values  over 
a  continuum,  such  as  0  <  i;;.  :<  1  (Fig.  1(e),  (f)).  These 
elements  are  similar  to  the  graded  neurons  employed  by 
Hopfield  in  a  differential  network  [10]. 

We  shall  now  examine  the  behavior  of  an  asynchronous 
network  composed  of  elements  which  can  take  on  a  con¬ 
tinuum  of  values.  Because  |  A  is  no  longer  fixed  there 
is  no  need  for  a  threshold/decision  operator;  we  will  sim¬ 
ply  use  the  value  of  Av,,  which  yields  the  greatest  de¬ 
crease  in  energy. 

It  was  noted  above  that  the  maximum  decrease  in  en¬ 
ergy  occurs  when 

Av,  =  A,/2. 

In  addition,  |  A^|  represents  an  upper  bound  on  \  A  v,\.  As 
one  approaches  the  solution,  {Tv  H-  b)f^  approaches  zero, 
so  this  upper  bound  decreases.  For  networks  with  a  fixed 
\Av,\  (e.g.,  ±1  for  all  k),  one  would  expect  \Ak\  even¬ 
tually  to  be  smaller  than  |  A  ,  so  no  changes  can  be  made 
to  reduce  the  energy,  even  though  the  network  is  not  yet 
at  the  global  minimum.  Thus,  the  fixed  step  methods  will 
generally  be  limited  to  some  outer  neighborhood  of  the 
global  energy  minimum  (although  one  might  arrive  at  the 
minimum). 
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Fig.  1 .  The  line  types  used  are  solid  for  an  image  or  a  neural  network  reconstruction;  dashed  tor  the  object;  and  dot-dashed  for 
the  algebraic  (SVD)  reconstruction,  (a)  Object  and  incoherent  image.  Ninety  two-state  {0,  1 }  elements  are  mapped  with  a  6-b 
base  2  scheme  to  15  pixels  in  the  object,  (b)  The  point-spread  function  of  the  imaging  system  (sine"),  (c)  Network  using  two- 
state  neurons  converges  to  a  local  energy  minimum  after  5  cycles,  ^  =  10  (d)  After  5  cycles  using  90  graded  neurons  te 

network  has  not  converged,  but  a  good  estimate  of  the  object  has  emerged,  (3  =  IQ-,  (e)  Object  and  image  containing  5% 
Gaussian  additive  noise,  (f)  Neural  reconstruction  from  (e)  after  50  cycles,  (3  =  lO'"  Note  that  it  is  nearly  indistinguishable 
from  the  SVD  result. 


However,  if  we  adopt  graded  neurons  it  is  still  possible 
to  use  very  simple  elements,  yet  circumvent  the  finite  step 
limitation.  Since  these  elements  can  take  on  a  continuum 
of  values  between  two  limits,  the  energy  is  forced  to  de¬ 
crease  at  each  step.  A  serially  threshold  network,  con¬ 
structed  from  these  elements  can  therefore  reach  the  global 
energy  minimum  after  a  sufficient  number  of  cycles.  One 
could  also  dispense  with  the  coding  scheme,  and  use  a 
smaller  number  of  more  complex  elements. 

We  would  like  to  operate  the  network  in  the  synchro¬ 
nous  mode  to  make  efficient  use  of  the  network’s  paral¬ 
lelism.  If  the  kxh  neuron  changes  by  Awf. 

+  =  wl"’  -f  Aw* 

where  convergence  is  assured  provided  that  the  conditions 
of  Section  VI  are  met. 

A  regularized  form  of  the  Gerchberg-Papoulis  algo¬ 
rithm  reads  [20] -[22] 

+  =  A'^g  +  [(1  -  /?)/  -  ALf]w'"’ 

=  w*"^  -f  (rw'"’  +  b). 


Thus,  if 

Aw*  =  (Tw*”^  -f  b)k 

parallel  operation  of  the  network  will  result  in  a  compu¬ 
tation  which  is  identical  to  the  regularized  Gerchberg-Pa¬ 
poulis  algorithm.  Since  the  latter  always  converges  [22], 
this  choice  for  the  Aw*  must  always  be  possible.  Optimal 
selection  of  the  A  w*  to  accelerate  convergence  of  the  net¬ 
work  in  the  parallel  mode  is  under  investigation. 

VIII.  Computational  Complexity  of  Neural 
Algorithm 

The  computational  complexity  associated  with  image 
reconstruction  or  superresolution  using  the  singular  value 
decomposition  of  A  to  solve  (8),  and  using  the  neural  net¬ 
work  approach,  has  been  examined  for  one-dimensional 
images  (Table  I) .  The  computational  load  associated  with 
the  neural  network  is  independent  of  whether  the  thresh¬ 
olding  is  serial  or  parallel,  although  the  actual  computa¬ 
tional  time  is  obviously  less  for  parallel  thresholding.  The 


1522 


IEEE  TRANSACTIONS  ON  SIGNAL  PROCESSING,  VOL.  39,  NO.  7,  JULY  1991 


TABLE  I 

The  Number  of  Operations  Required  for  Superresolution  by  SVD  Inversion  and  by  a  Neural 
Optimization  are  Listed  for  the  Total  Calculation,  and  for  Updating  Either  the  Image  or  the 

Regularization  Parameter 


Requirement 

Operation 

SVD 

Neural 

New  j6 

Mults 

+  N 

KN^  +  (K  +  l)N 

KN^  +  (K  +  \)N 

Adds 

N^ 

Divs 

N 

N 

New  image 

Mults 

2N^  +  N 

(K  +  l)N^  +  («■  +  l)N 

Adds 

2N^ 

(K  +  \)N^  +  KN 

Divs 

N 

N 

Total  operations 

Mults 

Adds 

\5N^  +  +  2N 

+  (K+  \)N^  +  {K+  i)N 
+  KN^  +  KN 

Divs 

N 

N 

number  of  additions,  multiplications,  and  divisions  for 
each  technique  are  listed  in  Table  I  for  three  situations. 
The  first  row  gives  the  number  of  calculations  needed  for 
a  new  value  of  the  regularization  parameter  (3;  the  neural 
network  has  a  disadvantage  in  this  case  because  it  must 
generally  run  for  some  K  iterations.  The  second  row  gives 
the  computational  cost  involved  in  updating  the  input  im¬ 
age  data  vector  g.  The  neural  network  once  again  is  at 
somewhat  of  a  disadvantage.  However,  by  examining  the 
total  number  of  operations  from  the  beginning,  one  can 
see  that  the  neural  approach  is  substantially  more  efficient 
because  it  calculates  a  matrix  product  once  without  the 
overhead  associated  with  singular  value  decomposition; 
numerical  experiments  using  a  microcomputer  indicate 
that  the  number  of  operations  grows  as  15N^  for  an  N 
point  image.  This  is  clearly  increasingly  significant  for 
larger  images. 

IX.  Discussion  and  Conclusions 

Neural  network  solutions  to  image  restoration  problems 
are,  therefore,  competitive  with,  but  not  necessarily  bet¬ 
ter  than,  more  traditional  methods  for  solving  the  problem 
(see  also  [22]).  We  have  shown  that  both  binary  and  non¬ 
binary  image  reconstruction  algorithms  can  be  imple¬ 
mented  on  very  similar  neural  architectures.  The  nonbi¬ 
nary  case  can  be  based  upon  network  elements  which  take 
discrete  (typically  two-state)  or  continuous  values.  In  the 
present  application,  the  diagonal  of  the  connection  matrix 
is  nonzero  and,  for  the  discrete  element  case,  the  crucial 
thresholding  step  was  modified  in  order  to  ensure  that  the 
energy  of  the  network  decreases  at  each  step;  the  thresh¬ 
olding  step  is  unnecessary  for  the  continuous  case. 

Thus,  convergence  to  a  minimum  of  the  energy  func¬ 
tion  is  guaranteed  for  the  both  the  discrete  and  continu¬ 
ous-element  networks.  However,  this  minimum  is  unique 
only  in  the  continuous  case,  since  discretization  of  the 
element  states  introduces  local  minima.  Convergence  has 
been  demonstrated  in  this  paper  only  for  the  case  of  serial 
thresholding;  for  parallel  thresholding  in  the  continuous 
case,  the  computation  reduces  to  the  regularized  Gerch- 
berg-Papoulis  algorithm  for  a  particular  choice  of  the  in¬ 
crements  in  the  states  of  the  network  elements. 
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ABSTRACT 

We  present  an  analysis  and  computational  results  relating  to  the  regularized  restoration  of  subpixel  information  from 
undersampled  data.  The  method  makes  use  of  a  small  set  of  images  in  various  stages  of  defocus.  An  iterative  implementation 
permits  the  incorporation  of  a  non-negativity  constraint.  The  problem  we  consider  is  fundamentally  under-detemiined,  but 
useful  results  can  be  obtained  in  reasonably  low  noise  conditions. 


L  INTRODUCTION 


The  investigations  discussed  here  form  part  of  a  program  whose  subject  is  the  enhancement  of  images  obtained  from  space- 
based  remote  sensors.  For  the  present  purpose,  these  images  are  assumed  to  consist  of  quantized  data  from  a  fixed  two- 
dimensional  set  of  sensors,  such  as  a  CCD  array.  Typically  for  these  arrays,  the  Airy  disc  is  smaller  than  one  pixel.  Thus,  for 
reasonably  fast  and  well-corrected  optics,  the  conventional  limit  on  system  resolution  is  likely  to  be  the  result  of  the  spatial 
sampling  associated  with  the  pixel  size;  i.e.,  the  integration  of  the  light  energy  in  the  image  over  the  area  represented  by  each 
pixel. 

For  clarity,  we  consider  the  case  of  one-dimensional  imaging  with  an  incoherent  source.  Let  the  system  point  spread  function 
(psO  be  represented  by  the  continuous  imaging  operator  L  .  Then,  for  an  isoplanatic  system  in  the  absence  of  noise,  the 
image  g  of  an  object  f  is  given  by  tlie  convolution 

g(y)  =  Is  L(x  -  y)  f(x)  dx  (1) 

where  S  is  the  support  of  f  .  The  output  of  the  detector  (pixel),  extending  from  Y^.  to  Yj,+|,is 


gk  = 


=  Iy^.  dy  L(x  -  y)  f(x)  dx 


Interchanging  the  order  of  integration  (which  is  certainly  permissible  with  the  physical  functions  considered  here)  and  making 
the  definition 


ak(x) 


we  obtain  for  the  pixel-integrated  image 

gk  =  Is^kW  fW  dx,  k=l,2, -,  K 


(2) 


0-8194 -0694 -5/9 1  /$4. 00 
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(3) 


We  now  discretize  the  object  into  a  set  {fj ,  j  =  1, 2  •••,  J}  over  equal  intervals  and  write,  as  an  approximation, 

j 

g,j  =  Z  a^j-fj,  k=l,2, -  sK 

j-i 

where  ajy-  is  the  integral  of  a^fx)  over  the  j*  interval.  In  matrix  form,  equation  (3)  can  be  written 

g  =  Af 


where  A  is  a  K  x  J  matrix. 

Since  the  a^j  can  be  determined  from  the  system  psf  and  the  pixel  array  structure,  the  deconvolution  problem  represented  by 
equation  (3)  can,  in  principle,  be  solved  and  the  set  {fj}  reconstracted  approximately  from  an  equal  set  of  measurements  of 
the  gk- 


In  practice,  the  restoration  problem  is  made  much  more  difficult  because  of  background  and  intrinsic  noise,  array 
imperfections  and  errors  or  uncertainties  in  the  knowledge  of  the  optical  properties  of  the  system.  When  pixel  integration  is 
over  a  significant  part  of  the  system  psf,  achieving  even  a  modest  degree  of  enhancement,  using  a  single  image,  poses 
intractable  difficulties  in  the  presence  of  these  perturbations.  However,  the  necessary  additional  information  can  be  derived 
from  multiple  differing  images  of  the  given  object.  After  a  brief  discussion  of  the  characteristics  of  the  inverse  problem 
represented  by  equation  (3),  we  shall  give  a  detailed  description  of  a  specific  method  for  acquiring  this  information. 

2.  REGULARIZED  IMAGE  RESTORATION 

There  are  many  applications  that  require  the  restoration  of  a  signal  or  image  from  a  limited  discrete  data  set;  for  example, 
samples  of  the  spatial  or  temporal  spectrum,  or  of  the  object's  low  or  band-pass  filtered  image.  An  important  a  priori 
assumption  in  image  restoration  is  that  the  object  or  objects  are  of  compact  support.  This  leads  to  the  well-known  result  that 
their  spectra  are  bandlimited  functions.  In  principle,  therefore,  one  might  hope  to  extend  limited  spectral  data  by  means  of 
analytic  continuation,  and  then,  by  Fourier  transformation,  obtain  an  enhanced  image.  This  procedure  is  notoriously  unstable 
in  the  presence  of  noise  and  does  not  provide  a  practical  solution  to  the  problem. 

This  central  difficulty  can  be  expressed  in  another  way.  Most  signal  or  image  recovery  problems  can  be  described  by  linear 
equations  of  the  form 

g(x)  =  J  A(x,y)  f(y)  dy 

where  A  is  the  system  point  spread  function  or  the  Fourier  transform  kernel,  for  example.  The  interpretation  of  the  data  g(x) 
to  obtain  information  about  the  object  f(y)  requires  the  solution  of  a  linear  inverse  problem.  This  is  equivalent  to  finding  the 
solution  of  a  Fredholm  integral  equation  of  the  first  kind.  It  is  well-known  in  such  a  case  that  small  fluctuations  in  the  data 
g(x)  can  lead  to  very  large  fluctuations  in  the  estimate  of  the  unknown  function  f(y).  This  is  a  manifestation  of  the  ill-posed 
nature  of  the  problem  (the  inverse  operator  is  unbounded)  and  some  method  of  stabilization  is  needed  to  determine  useful 
solutions. 

The  problem  of  a  lack  of  continuous  dependence  on  the  data  can  be  overcome  by  one  of  the  various  techniques  of 
regularization  [1,  2].  Essentially,  the  ill-posed  problem  is  replaced  by  a  related  well-posed  one,  chosen  to  be  physically 
meaningful  and  to  possess  the  necessary  properties  of  convergence  and  stability.  Thus  we  may  change  the  concept  of  a 
solution,  or  the  Hilbert  spaces  of  which  the  object  and  image  are  elements,  or  their  topologies,  or  the  operator  itself.  The 
technique  we  shall  use  belongs  to  the  last  category. 

We  impose  physically  reasonable  constraints  on  the  permitted  solutions.  If  e  is  a  measure  in  the  norm  sense  of  the  noise  in 
the  image  (norm  in  the  appropriate  Hilbert  space  is  denoted  by  II  •  1 1  h  )  and  if  C  is  some  constraint  operator  with  E  a 
known  bound,  we  shall  require  that  all  possible  reconstmctions  f  satisfy 
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M  g  -  A/'  1 1 G  ^  £  and  1 1  C/'  M  p  ^  E 

C  may  be  used,  for  example,  to  impose  smoothness  on  the  reconstruction  or  to  weight  the  reconstmction  support.  If  C  is  the 
identity  operator,  E  is  a  bound  on  the  norm  of  the  reconstruction.  We  combine  the  constraints  quadratically  and  minimize  the 
functional 


Ilg-A/’ll^  +  p  II  C/-  lip 

where  p  =  Note  that  smaller  values  of  P  are  equivalent  to  demanding  greater  fidelity  between  the  reconstruction  and  the 
data;  greater  values  place  more  emphasis  on  the  property  of  the  reconstruction  associated  with  C.  In  the  present  discussion  we 
shall  take  C  =  L  The  minimizer  fp  can  then  be  expressed  in  either  of  the  forms 


/p  =  (A*A  +  pl)-i  A*g 
or 

/p  -  A*  (AA*  +  pi)-i  g 


(4) 


where  A*  is  the  operator  adjoint  to  A.  The  inverses  of  the  bracketed  operators  will  always  exist,  since  the  eigenvalues  of  the 
symmetric  operators  A*A  and  AA*  are  non-negative.  The  operator,  the  image  and  the  reconstruction  will  consist  in  practice 
of  finite  arrays  and  equation  (4)  becomes  a  matrix  equation. 


3.  SUB-PIXEL  RESOLUTION  FROM  MULTIPLE  DEFOCUSSED  IMAGES 


We  can  state  the  deconvolution  problem  in  the  more  general  case  where  we  are  given  a  set  of  R  differing  noisy  images  of  the 
same  object  over  the  same  pixel  array. 

Then  the  r^  image  consists  of  the  vector  of  pixel  outputs  given  by  the  equation 

g^'')  =  Aj.  f  +  n^*")  (5) 

where  n(^)  represents  some  additive  noise  vector.  (Other  forms  of  noise  can  be  accommodated  by  appropriate 
modifications  of  equations  (4)  and  (5),  but  we  shall  not  address  this  question  here.)  We  are  required  to  estimate  f  from  the 
set  {gn  r=l,2,**-,R}. 

We  obtain  the  solution  by  assembling  the  image  set  into  one  composite  image,  and  the  corresponding  matrices  of  integrated  psf 
samples  a^  into  a  single  imaging  matrix.  The  regularized  solution  is  then  again  given  by  equation  (4).  The  appropriate  value 
for  the  regularization  parameter  p,  which  is  closely  related  to  the  signal-to-noise  ratio  in  the  data,  can  be  estimated  in  several 
ways;  for  example,  by  the  method  of  weighted  cross-validation  [3]. 

The  image  set  required  to  achieve  sub-pixel  resolution  can  be  derived  by  various  means.  Stark  and  Oskoui  [4]  discuss  an 
object  reconstruction  technique  which  uses  a  set  of  images  differing  from  one  another  by  rotation  or  lateral  translation  of  the 
pixel  array.  It  is  evident  that  acquiring  a  sequence  of  data  sets  by  lateral  displacement  of  the  detector  (or  image)  through  some 
fraction  of  a  pixel  at  each  step  allows  one  to  sample  the  image  as  finely  as  is  desired.  (For  the  two-dimensional  image,  the 
displacements  can  be  in  any  direction.)  Rotational  displacement  similarly  permits  arbitrarily  fine  sampling.  They  consider  the 
specific  case  in  which  the  system  psf  is  much  smaller  than  a  pixel,  so  that  resolution  in  the  data  is  governed  almost  completely 
by  pixel  integration  rather  than  the  optical  properties  of  the  system. 

An  alternative  method,  which  we  shall  adopt,  is  to  use  an  image  set  obtained  with  differing  point-spread  functions.  These 
could  be  generated  by  separate  optical  systems,  or  conceivably  at  different  wavelengths.  The  system  psf  can  also  be 
conveniently  altered  by  varying  the  degree  of  defocus;  implementations  of  this  method  could  depend  on  a  single  detector 
translated  into  the  chosen  planes  of  defocus,  or  a  system  of  beamsplitters  and  detector  arrays  in  appropriate  locations. 

If  the  reconstmction  procedure  is  to  be  effective,  the  various  images  must  contain  significantly  different  information,  which 
implies  that  the  point  spread  functions  must  differ  appreciably  over  the  scale  of  a  pixel.  We  shall  assume  that  the  images  are 
formed  on  the  same  array,  or  arrays  with  identical  characteristics,  and  that  the  object  field  is  spatially  and,  if  necessary. 
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temporally  invariant  from  one  image  to  another.  We  shall  also  assume  that  the  point  spread  functions  are  accurately  known 
and,  for  the  purposes  of  the  algorithm  used  later,  that  the  images  are  formed  from  incoherent  radiation.  We  do  not  require  that 
any'of  the  point  spread  functions  are  much  smaller  than  a  pixel;  for  the  illustration  presented  below,  the  psf  at  focus  is  about 
half  the  size  of  a  pixel.  In  a  practical  implementation,  the  images  would  not  be  centered  on  the  pixel  arrays  in  exactly  the  same 
way;  i.e.,  there  would  be  some  lateral  translation  between  the  images.  The  effect  of  including  lateral  translations  has  not  yet 
been  investigated,  but  one  would  not  expect  reconstmction  quality  to  suffer  under  these  circumstances. 

4.  ILLUSTRATION 

To  illustrate  these  ideas,  we  include  some  results  from  a  numerical  simulation.  The  object  field  consists  of  a  group  of  delta- 
function-like  incoherent  radiators.  We  shall  show  that  from  a  small  set  of  images,  one  at  focus,  the  others  at  various  stages  of 
defocus,  object  locations  can  be  well  recovered,  even  when  several  of  the  objects'  geometrical  images  lie  within  a  single  pixel. 
The  algorithm  is  based  on  the  regularized  formula  of  equation  (4);  in  addition,  the  calculation  is  iterated  a  small  number  of 
times  to  enforce  non-negativity  on  the  reconstruction. 

The  reconstmction  is  made  initially  into  a  spatial  region  defined  by  the  central  lobe  of  the  focussed  image,  but  using  a  finer 
grid.  No  prior  assumptions  are  made  about  the  locations  or  the  number  of  objects  within  this  region.  For  small,  relatively 
isolated  sources,  some  ringing  will  occur  in  the  reconstmction,  with  associated  negative  pixel  values.  The  support  is 
progressively  refined  by  eliminating  these  pixels  at  each  iteration  until  an  entirely  positive  reconstmction  is  obtained.  The 
smallest  object  space  which  is  consistent  with  the  image  data  will  yield  the  best  reconstmction,  and  the  problem,  initially 
underdetermined,  becomes  finally  an  overdetermined  one. 

A  modified  form  of  this  scheme,  which  would  be  appropriate  for  more  extended  objects,  includes  a  weight  matrix  which 
biasses  the  next  iteration  against  those  pixels  with  negative  values.  The  computation  is  significantly  slower  in  this  case,  since 
the  size  of  the  reconstmction  space  remains  constant.  We  also  note  the  possibility  of  using  a  regularized  form  of  the  non¬ 
negative  least-squares  algorithm  of  Lawson  and  Hanson  [5].  The  relative  performance  of  this  procedure,  also  of  course 
iterative,  has  not  been  fully  evaluated. 

In  this  example,  the  object  field  consists  of  eleven  highly  localized  sources  of  equal  intensity,  distributed  over  a  3x3  block  of 
image  pixels;  see  Figure  1.  (It  should  be  noted  that  the  reconstmction  grid  used  did  not  coincide  with  the  object  grid;  hence  the 
objects  cannot  appear  as  single-pixel  "points"  in  the  reconstmction.)  The  central  lobe  of  the  system  psf  was  about  half  the 
width  of  a  pixel.  Four  independent  images  containing  equal  energy  were  generated  over  a  7x7  block  of  pixels,  the  first 
corresponding  to  a  focussed  system,  the  others  at  various  stages  of  defocus.  A  method  was  devised  for  choosing  defocus 
conditions  with  significantly  different  information  content.  Starting  with  the  focussed  image,  with  associated  imaging  matrix 
Aq,  we  wish  to  find  a  defocussed  image  whose  imaging  matrix  Aj  is  as  independent  as  possible  from  Aq.  The  criterion  used 
for  this  purpose  was  the  magnitude  of  the  condition  number  (the  ratio  of  the  largest  to  the  smallest  singular  values)  of  the 
matrix  formed  from  the  center  columns  of  Aq  and  Aj.  The  range  of  defocus  over  which  the  search  was  made  was  from  0  to  4 
waves.  The  combination  selected  was  that  possessing  the  smallest  condition  number.  Knowing  Aq  and  Aj,  A2  was 
determined,  and  finally  A3.  The  merit  functions  (reciprocals  of  condition  numbers)  calculated  for  combinations  of  two,  three 
and  four  defocus  levels  are  shown  in  Figure  2.  Note  that  a  zero  occurs  whenever  the  variable  degree  of  defocus  coincides  with 
that  corresponding  to  one  of  the  other  imaging  matrices;  then  the  composite  matrix  is  singular  and  its  condition  number 
bcomes  unbounded.  The  degrees  of  defocus  chosen  for  the  four  images  in  this  case  were  of  magnitude  0,  0.8,  1.6  and  2.4 
waves.  The  corresponding  images  are  shown  in  Figures  3-6.  The  only  readily-identifiable  feature  in  the  focussed  unage  is  that 
the  central  pixel  is  brighter  than  the  surrounding  ones. 

Reconstructions  were  performed  over  the  region  defined  by  the  central  block  of  3x3  pixels,  with  sampling  seven  tunes  as  fine 
as  that  in  the  data.  Thus  the  reconstmction  space  initially  consisted  of  441  points,  while  the  four  images  provided  a  total  of 
196  data  points.  The  reconstmctions  obtained  when  each  of  the  images  was  cormpted  by  additive  Gaussian  noise  with 
standard  deviation  equal  to  1%  of  the  mean  pixel  content  is  shown  in  Figure  7.  All  of  the  objects  are  located  close  to  their  tme 
subpixel  positions.  Figure  8  shows  the  result  obtained  when  the  noise  level  is  increased  to  5%.  The  reconstmction  is  still 
generally  accurate,  although  there  is  now  some  distortion  in  object  location  and  one  or  two  small  artifacts  have  begun  to 
appear.  A  signal-to-noise  ratio  can  be  defined  as  the  ratio  for  each  image  of  the  sum,  on  a  pixel-by-pixel  basis,  of  the  signal 
power  to  the  sum  of  the  noise  power.  At  the  5%  level,  this  quantity  varied  between  33  and  28  dB.  This  example  was  designed 


368  /  SPIE  Vo/.  1566  Advanced  Signal  Processing  Algorithms,  Architectures,  and  Implementations  II  (1991) 


to  be  reasonably  ehallenging,  and  spreading  the  objects  further  apart,  or  reducing  their  number,  considerably  increases  the 
algorithm's  robustness  against  noise. 


5.  CONCLUSIONS 

A  method  for  recovering  detail  at  the  sub-pixel  level  from  a  small  set  of  images  in  various  stages  of  defocus  has  been  discussed 
and  demonstrated,  using  an  algorithm  which  incorporates  regularization  to  counter  the  destabilizing  effects  of  noise.  The 
iterative  version  of  this  algorithm,  which  permits  the  inclusion  of  a  non-negativity  constraint,  is  particularly  effective  at 
recovering  accurate  object  support  estimates.  It  is  therefore  an  appropriate  technique  to  use  when  it  is  known,  a  priori,  that  the 
object  field  consists  largely  of  small  well-separated  targets.  The  sensitivity  to  noise  of  the  method  deserves  more  detailed 
investigation.  A  quantitative  comparison  of  its  performance  with  methods  which  make  use  of  laterally  translated  or  rotated 
image  sets  would  also  be  of  considerable  interest.  In  practice  there  would  inevitably  be  some  lateral  shift  between  images, 
even  if  they  are  formed  on  the  same  array,  and  a  hybrid  scheme  might  prove  to  be  the  most  robust  in  the  presence  of  noise. 
Ultimately,  of  course,  the  performance  of  any  restoration  algorithm  must  be  limited  by  the  information  content  of  the  image 
set.  What  should  be  sought,  therefore,  is  the  encoding  scheme  which  most  efficiently  exploits  the  total  information  carried  by 
the  incident  radiation. 
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Fig.  1. 
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Reciprocal  of  condition  no.  of  joint  matrix 


Fig.  2.  Merit  functions 
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Fig.  3.  Image  at  0.0  waves  defocus 


Fig.  4.  Image  at  0.8  waves  defocus 
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Fig.  5.  Image  at  1.6  waves  defocus 


Fig.  6.  Image  at  2.4  waves  defocus 
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Fig.  7.  Reconstruction  from  image  with  1% 
added  noise  (beta  =  5  x  10'^) 
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ABSTRACT 

A  new  algorithm  for  the  restoration  of  extended  images,  the  Regularized  Pseudoinverse  Deconvolution  (RAPID) 
algorithm,  is  proposed.  The  algorithm  consists  of  expanding  the  regularized  pseudoinverse  of  the  imaging  operator 
into  a  sequence  of  terms  which  can  be  easily  implemented  using  Fourier  processing  techniques.  The  first  term  of  the 
expansion  is  closely  related  to  generalized  Wiener  filtering  if  the  point  spread  function  is  shift-invariant.  The  other 
terms  in  the  expansion  are  correction  terms  which  are  small  when  the  point  spread  function  is  shift-invariant,  as  is  the 
case  with  many  imaging  systems.  Even  when  the  point  spread  function  of  the  imaging  system  is  space-variant,  such  as 
with  a  partially  obscured  imaging  system  or  a  system  with  severe  aberrations,  the  correction  terms  are  both  few  in 
number  and  easily  implemented. 


1.  INTRODUCTION 

In  the  absence  of  any  other  degrading  effects,  the  performance  of  an  optical  system  is  ultimately  restricted  by 
diffraction.  The  finite  extent  of  the  entrance  pupil  imposes  a  fundamental  upper  limit  on  the  system's  spatial  frequency 
response.  The  image  quality  of  most  operational  systems  will  not,  however,  approach  this  theoretical  limit  very  closely. 
It  is  possible  that  the  design  or  construction  will  be  flawed,  as  in  the  case  of  the  Hubble  Space  Telescope,  through 
defective  manufacture,  assembly  or  quality  assurance  procedures.  In  addition,  aging  of  components  will  almost 
certainly  compromise  sensor  performance  at  some  stage  in  its  lifetime,  and,  as  in  spaceborne  operation,  replacement 
may  not  be  a  simple  task.  The  detector  itself  may  impose  limitations;  for  example,  where  a  CCD  array  is  used, 
information  is  lost  in  the  inter-pixel  areas,  and  image  energy  is  integrated  over  the  active  area  of  each  pixel.  Other 
degrading  factors  will  include  defective  pixels,  noise  in  the  CCD  array  and  electronic  subsystems,  and  a  possibly 
obtrusive  background.  The  methods  of  image  restoration  considered  here  were  originally  aimed  at  achieving 
performance  beyond  the  diffraction  limit  \  but  are  in  fact  capable  of  compensating  simultaneously  or  separately  for 
aberrations  induced  by  the  optical  components  and  for  the  limitations  of  the  detector.  They  arc  inherently  robust  and 
possess  valuable  noise-suppressing  properties. 
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Th0  assumption  is  made  that  the  overall  effect  of  the  optics  can  be  described  as  a  possibly  time-  and  shift-variant 
blurring  of  the  image  due  to  diffraction  and  aberrations.  Thus,  at  any  instant  of  time,  the  point  spread  function  may 
change  across  the  sensor  field-of-view.  It  can  be  assumed,  however,  that  at  any  given  point  in  the  image  the  point 
spread  function  is  effectively  determined  by  its  spatial  location  and  is  time-invariant  over  the  integration  time  of  the 
sensor  and  then  undergoes  a  change  at  a  later  time.  This  is  the  case  with  the  Hubble  Space  Telescope.  It  will  be 
assumed  that  the  set  of  point  spread  functions  is  known  or  can  be  measured.  For  the  Hubble  wide-field  planetary 
camera  there  are  two  primary  components  to  the  point  spread  function;  one  due  to  diffraction  by  the  aperture 
obstructions  and  the  other  due  to  spherical  aberration.  The  point  spread  function  is  observed  to  be  locally  shift- 
invariant,  and  the  image  can  be  considered  to  be  created  by  the  summation  across  the  entire  field-of-view  of  the 
segmented  set  of  localized  point  spread  functions  convolved  with  objects  in  the  corresponding  parts  of  the  field. 

However,  for  a  simple  refracting  telescope  designed  for  observation  of  extended  objects,  the  aberrations  causing 
image  degradation  can  be  highly  space-variant.  For  an  extended  object,  image  segmentation  introduces  undesirable 
edge  effects  at  the  block  boundaries  when  locally  shift-invariant  approximations  of  the  globally  shift-variant  point 
spread  function  are  used  to  process  the  separate  blocks.  In  addition,  post  processing  interpolation  or  iteration  is 
required  to  smooth  the  block  boundaries  in  the  final  composite  image  In  the  approach  proposed  in  this  paper,  the 
point  spread  function  is  allowed  to  vary  continuously  over  the  whole  image  and  a  special  decomposition  of  the  image 
reconstruction  operator  is  performed  which  permits  Fourier  techniques  to  be  used  to  process  the  entire  image. 

The  overall  optical  system  imaging  equation  can  be  written  as  a  Fredholm  integral  equation  of  the  first  kind. 
Solutions  to  ill-posed  problems  of  this  type  are  known  to  be  numerically  unstable  Additionally,  it  is  anticipated  that 
the  image  will  be  spatially  sampled  by  a  solid-state  sensor  which  will  introduce  spatial  integration,  discretization  and 
associated  noise  processes.  Thus,  after  scanning  the  image  into  a  vector,  the  integral  representing  the  continuous  image 
can  be  rewritten  as  a  matrix  expression.  In  general,  the  presence  of  the  sensor  noise  takes  the  measured  image  vector 
out  of  the  span  of  the  columns  of  the  kernel  matrix,  which  is  typically  highly  ill-conditioned.  Thus,  when  it  is  desired  to 
estimate  what  the  image  would  have  been  in  the  absence  of  aberrations  or  with  less  diffraction  than  the  instrument  can 
provide,  techniques  derived  from  regularization  theory  are  required  to  restore  stability  to  the  reconstruction.  By 
introducing  a  suitable  error  criterion  (based  on,  e.g.,  a  vector  norm),  images  can  be  constructed  which  are,  in  terms  of 
the  chosen  criterion,  closer  to  the  undistorted  geometrical  image  of  the  object  than  the  detected  image  data. 

To  the  extent  permitted  by  the  noise  in  the  image,  in-band  effects  can  usually  be  removed  by  some  form  of 
pseudoinverse  filter  However,  detector  pixellation  and  the  finite  aperture  of  any  system  set  resolution  limits  not  so 
easily  overcome,  and  a  method  for  achieving  spectral  extrapolation  has  to  be  devised.  The  spatial  spectrum  of  the 
object  is  the  Fourier  transform  of  its  amplitude,  in  the  coherent  case,  or  its  intensity,  in  the  incoherent  case.  If  the  object 
is  known  to  be  of  finite  extent,  its  Fourier  transform  is  an  analytic  function,  and  the  out-of-band  part  of  the  spectrum 
can  in  principle  be  fully  recovered  by  analytic  continuation  ^  of  the  image  spectrum  after  removal  of  any  in-band 
distortion.  The  inverse  Fourier  transform  of  this  extended  spectrum  would  then  yield  a  perfect  image  of  the  original 
object.  Equivalently,  one  could  attempt  to  solve  directly  the  equation  describing  the  imaging  process.  This,  however, 
involves  the  inversion  of  an  ill-conditioned  matrix,  and  the  restoration  process  is  intrinsically  unstable,  even  small 
amounts  of  noise  rendering  the  results  meaningless.  These  difficulties  may  be  surmounted  by  applying  the  methods  of 
regularization  theory  developed  to  deal  with  ill-posed  problems  of  this  type;  the  solution  is  derived  by  means  of  a 
constrained  least-squares  procedure  in  which  a  regularization  parameter  plays  an  essential  role.  Stability  in  the 
restored  image,  which  is  computed  via  the  regularized  pseudoinverse  of  the  imaging  matrix,  is  controlled  by  this 
parameter.  Its  optimal  value  depends  on  the  signal-to-noise  ratio  in  the  data. 
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2.  NATURE  OF  THE  PROBLEM 


We  wish  to  estimate  the  object  /  from  an  image  g,  given  that 

g  =  Af+r  (1) 

where  A  is  the  imaging  operator  and  r  represents  the  corrupting  effect  of  additive  noise.  For  clarity  in  the  analysis, 
we  consider  the  one-dimensional  case  with  the  operator  A  given  in  integral  form  by 

{Af)  iy)  =  \  A{x,y)f{x)dx,  c<y<d.  (2) 

In  the  absence  of  noise,  Eq.(l)  becomes  a  Fredholm  equation  of  the  first  kind,  in  which  the  unknown  function  appears 
only  under  the  integral  sign.  We  can  identify  the  sources  of  difficulty  in  solving  this  equation  in  the  presence  of  noise  or 
other  perturbations  (such  as  computer  round-off  error)  by  means  of  a  singular  function  analysis®. 

We  expand  the  kernel  of  the  integral  in  terms  of  the  singular  functions  (x)  and  v.  (y) ,  orthonormai  systems  in 
object  and  image  spaces  respectively,  and  the  singular  values  a.: 

OO 

A(x,y)  =  u.ix)  v.(y).  (3) 

i  =  1 

The  object  and  image  can  be  expanded  in  the  forms 

OO 

1  =  1 

OO 

^(y)  =  ^j(y) 

1  =  1 


where  the  coefficients  are  related  to /(x)  and  g(y)  by  the  integral  formulae 


*b 

fi  = 

f(x)u.(x)  dx 

(6) 

and 

'  a 

rrf 

^(y)^j(y)rfy' 

(7) 

In  the  noiseless  case  (r  =  0) ,  we  find  from  Eqs.  (2)  and  (6) 

OO 

(Af)  (y) 

f  =  1 

(8) 
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It  follows,  using  Eqs.(5)  and  (8),  that 


and  hence 


9.  =  CT.  f. 

z  h 


(x). 


(9) 


(10) 


i  =  1 


Thus  the  object  function  can  in  principle  be  perfectly  reconstructed  from  the  set  [g^]  of  image  coefficients. 

Now  consider  the  effects  of  noise.  1 
the  noise  to  the  new  image  coefficients: 


Now  consider  the  effects  of  noise.  By  expanding  r(y)  in  terms  of  the  (y)  ,  we  can  derive  the  contribution  of 


The  estimate  of  f(x)  is  now 


-  a.  f.  +  r.. 
'  I  I  h  i 


fM  =  fM  +  X  ^  “i 

Z  =  1  ^ 


(x), 


(11) 


(12) 


Image  formation  is  characteristically  described  by  an  integral  transform  of  convolution  type,  z.e.,  A  is  a 
convolution  operator.  Its  singular-value  spectrum  typically  decays  asymptotically  at  an  exponential  rate^.  Since  the  r. 
will  in  general  decrease  less  .quickly,  the  sum  in  Eq.(12)  will  be  divergent  and  no  bound  will  exist  for  the  'distance'  (in 
the  sense  of  some  appropriate  metric)  between  the  true  object  and  the  reconstruction.  The  effect  of  the  noise  on  the 
reconstructed  image  is  a  manifestation  of  the  fact  that  convolution  is  a  strongly  smoothing  process  -  closely  similar 
images  can  correspond  to  widely  differing  objects.  Thus  image  restoration  is  an  ill-posed  problem,  small  perturbations 
in  the  data  causing  large  changes  in  the  solution  represented  by  Eq.(12). 

3.  THE  REGULARIZED  SOLUTION 

The  methods  of  regularization  theory  can  be  exploited  to  convert  the  problem  to  a  related  well-posed  one,  z.e., 

one  for  which  the  solution  exists,  is  unique  and  depends  continuously  on  the  data.  Since  we  shall  later  be  concerned 

with  computed  reconstructions,  we  henceforth  consider  the  problem  in  its  finite  discretized  form;  thus  the  imaging 

operator  A  becomes  a  matrix.  Although  strictly  we  should  introduce  new  symbols,  for  convenience  we  continue  to 

use  /,  g,  and  A  to  denote  the  discrete  forms  of  object,  image,  and  imaging  operator.  Generally  feC^.ge  and 
.  X  n 

Ae  C 


To  regularize  the  problem,  we  shall  modify  A.  We  impose  constraints  on  possible  solutions  / '  by  requiring  that 

(13) 

where  e  is  some  suitable  measure  of  the  noise  in  the  image,  and  that 


(14) 
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where  ^  is  some  suitable  measure  of  the  permitted  'signal  strength'  of  the  solution.  (  II  •  I 
Hilbert  spaces  associated  with  object  and  image.)  We  combine  these  constraints  and  minimize 


denotes  norm  in  the 


\Af-gf  +  ^\\  /’  11^ 


where  the  regularization  parameter  p  is  given  by 


P  = 


The  minimum-norm  solution  to  this  constrained  least-squares  problem  is  given  by 

/p  =  ^p^ 


2 

/Ip  =  (A^A  +  ^1)  A^. 


We  note  the  relationship  of  At  (which  we  shall  call  the  regularized  pseudoinverse)  to  the  Moore-Penrose 


pseudoinverse 


A^  =  lim  /In. 
p->0 


The  inverse  of  {A^A  +  ^I)  always  exists,  since  A^A  is  non-negative  definite  and  the  regularization  parameter  p 
is  positive.  A  value  of  P  should  be  chosen  which  balances  data  fidelity  against  smoothness  in  the  reconstruction. 
.Methods  are  also  available  for  determining  p  from  the  image  data  themselves 

4.  SINGULAR  VALUE  AND  FOURIER  DECOMPOSITIONS 
It  is  often  convenient  to  compute  the  regularized  pseudoinverse  via  the  singular  value  decomposition  (SVD)  of  A: 


where 


A  =  UZV^ 


U^U  =  1  ,  V^V  =  VV^  =  I 
m  n 


I  =  diag  c7.>  0. 


The  singular  values  { a .}  are  assumed  to  have  been  arranged  in  descending  order  of  magnitude 


Then  we  find 
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where 


Zp  =  diag 


a^p’ 

t 


This  representation  is  useful  when  the  behavior  of  the  reconstruction  as  a  function  of  the  regularization  parameter  is 
being  studied.  Regularization  can  equivalently  be  achieved  by  setting  P  to  zero,  and  simply  truncating  the  singular 
value  series  at  a  point  which  is  dependent  on  the  noise  level 

An  ab  initio  computation  of  /p  via  Eq.(23)  requires  the  SVD  of  A  followed  by  two  matrix-vector  multiplications.  If 
the  regularized  pseudoinverse  can  be  precomputed,  only  one  matrix-vector  product  is  needed  to  generate  the 
reconstruction.  For  images  of  more  than  modest  sizes,  however,  the  computation  rapidly  becomes  burdensome.  If,  for 
example,  /  and  g  are  100x100,  A  is  a  lO^-by-lo'^  matrix,  and  the  matrix-vector  product  requires  10  multiplications. 
There  will  also  be  considerable  storage  demands.  Thus,  if  major  computational  resources  are  not  available,  some 
means  of  simplifying  the  calculation  will  be  needed  in  many  cases  of  practical  interest. 

If  the  matrix  A  were  square  circulant  of  order  n(A=  [a-  ,  subscript  mod  n),  we  could  dramatically  reduce 

the  computational  burden  by  exploiting  the  fact  that  the  Fourier  transform  diagonalizes  a  circulant  If  f  denotes  the 
Fourier  matrix: 


1  1 

1  w 


M  1 


1  w 


w 

w 

w 

2 

4 

2  (n-1) 

w 

w 

VO 

n-1 

w  ... 

(n-1)^ 

w 

w  =  exp (ilnj  n) 


w'here 


It  follows  from  Eq.(16)  that 


where 


A  =  f^Af 

A  =  diag  V 


=  f"  a;  f 


Ar.  =  diag  ....  — - - , 

^  l^r  +  P 

V  i 


and  X- is  the  complex  conjugate  of  X..  We  note  that  the  singular  values  {a.}  and  the  eigenvalues  { X.}  are  related  by 

o.  =  1^.1.  (30) 

I  I  i| 

3 

The  operational  counts  for  the  SVD  and  the  FFT  are  O  (n  )  and  O  (nlogn)  ,  respectively. 
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5.  STRUCTURE  OF  THE  IMAGING  MATRIX 


Under  some  circumstances  the  imaging  matrix  can  be  readily  modified  to  circulant  form.  For  a  shift-invariant  one¬ 
dimensional  system,  the  image  is  a  convolution  of  the  point  spread  function  and  the  object.  If  the  sampling  intervals  in 
image  and  reconstruction  spaces  are  equal,  the  matrix  A  is  then  Toeplitz  {A  =  [Sy  _  jl )  •  If/  addition,  the  image  and 
reconstruction  vectors  are  of  equal  length,  A  can  be  padded  to  circulant  form.  In  two  dimensions,  /  and  g  are  matrices 
and  must  be  mapped  into  vectors.  The  manner  in  which  this  is  performed  will  determine  the  structure  of  A.  For 
column-wise  mapping,  for  instance,  again  with  equal  sampling  in  image  and  reconstruction  spaces,  A  becomes  block- 
Toeplitz  with  Toeplitz  blocks.  If  the  image  and  reconstruction  matrices  have  the  same  number  of  elements,  A  can  be 
padded  to  become  block-circulant  with  circulant  blocks,  and  the  problem  is  again  amenable  to  Fourier  transform 
methods.  In  both  one  and  two  dimensions,  it  should  be  noted  that  the  penalty  associated  with  the  expansion  of  A  to 
circulant  or  block-circulant  form  is  a  relaxation  of  the  support  constraint  on  the  reconstruction,  which  renders  the 
calculation,  and  in  particular  the  degree  of  resolution  enhancement  achieved,  much  more  sensitive  to  noise  .  An 
alternative  construction  in  the  two-dimensional  case  is  to  zero-pad /and  g  into  larger  matrices  and  then  to  use  a  linear 
congruential  scan  to  map  the  padded  matrices  into  vectors.  A  then  becomes  a  circulant  matrix  since  the  linear 
congruential  scan  is  an  isomorphism  between  2D  convolution  and  ID  convolution 

For  less  structured  imaging  matrices  {e.g.,  if  the  system  is  weakly  space-variant)  it  may  be  asked  whether 
accelerated  computation  of  the  matrix-vector  product  is  still  possible.  In  this  context,  recent  work  on  circulant 
approximations  to  matrices  of  quite  general  form^°  appears  highly  relevant,  and  includes  the  following  result.  For  any 
matrix,  A^  say,  we  can  write 

a 

Ajt  =  C  +  y  L(x  )C^(y)  (31) 

p  0  m 

m  =  1 


where  C  is  a  circulant  matrix  with  the  same  last  row  as  At,L  (x^)  is  a  lower  triangular  Toeplitz  matrix  with  x^ 
as  its  first  column,  and  C(y^)  is  a  circulant  matrix  whose  last  row  is  y^.The  {x^}  and  {y^^}  may  be  obtained 
from  the  truncated  SVD  of  the  cyclic  displacement  of  A^ : 

a 

At-EAtz'^  =  \  X  /  (32) 

p  p  /  /  m^m 

m  =  1 


where  E  is  the  cyclic  downshift  matrix 


£  = 


0  0  0  ...  0  1 

1  0  0  ...  0  0 

0  1  0  ...  0  0 


0  0  0  ...  1  0 


(33) 


For  imaging  matrices  with  strongly  Toeplitz  features,  a  should  be  a  small  number. 
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6.  THE  REGULARIZED  PSEUDOINVERSE  DECONVOLUTION  ALGORITHM 


Consider  a  two-dimensional  optical  imaging  system  whose  point  spread  function  is  both  time-  and  space- 
invariant.  In  the  presence  of  additive  noise,  the  imaging  equation  connecting  the  input  (extended  object),  the  output 
(degraded  image),  and  the  point  spread  function  (impulse  response)  is  given  by  the  following  two-dimensional 
convolutional  integral  equation 


i  {x,  y) 


0  {x',  y')pix-x\y-y')  dx'dy'  +  n  {x,  y) . 


In  shorthand  notation 


i  -  p'k'ko  +  n 


where  o(x,y)  represents  the  extended  object,  i(x,y)  the  degraded  image,  p(x,y)  the  point  spread  function,  and  n(x,y)  the 
noise.  It  is  assumed  that  the  noise  is  independent  of  position  in  the  image. 

The  discrete  version  of  Eq.(34)  can,  of  course,  be  cast^^  into  the  following  vector-matrix  form 


g  =  r. 


When,  in  particular,  /,  g,  and  r  represent  the  one-dimensional  column  vectors  formed  by  stacking  the  rows  or  columns 
of  the  discretized  versions  of  the  input  o(x,y),  output  i(x,y),  and  the  noise  n(x,y),  respectively,  the  two-dimensional 
imaging  matrix  A  is  block-Toeplitz  with  Toeplitz  blocks.  It  can  be  shown,  using  known  results  that  the  regularized 
pseudoinverse  Ag  is  block-persymmetric  with  persymmetric  blocks.  The  inverse  of  a  Toeplitz  matrix  is  persymmetric, 
and  persymmetric  matrices  obtained  by  inverting  Toeplitz  matrices  have  much  more  Toeplitz-like  structure  than 
general  persymmetric  matrices.  In  particular,  their  displacement  rank  is  the  same  as  that  of  the  parent-Toeplitz 
matrix^^'  Displacement  rank,  it  should  be  noted,  is  a  quantitative  measure  of  the  closeness  of  a  given  matrix  to  being 
Toeplitz.  In  one  dimension,  the  displacement  rank  of  a  Toeplitz  matrix  is  <  2,  and  the  displacement  rank  of  Ap  is  <  4 . 


In  the  early  stages  of  this  investigation,  one  of  the  authors  (R.P.B.)  noted  that  for  a  variety  of  point  spread  functions 
being  studied  the  corresponding  computer-generated  regularized  pseudoinverse  matrices  appeared  to  have  banded 
block-Toeplitz  structure.  He  considered  the  implication  of  this  observation.  In  particular,  if  a  space-invariant  point 
spread  function  used  in  a  two-dimensional  linear  convolutional  imaging  equation  leads  to  a  block-Toeplitz  imaging 
matrix  with  Toeplitz  blocks,  then  the  converse  must  also  be  true.  That  is,  given  a  block-Toeplitz  imaging  matrix 
containing  Toeplitz  blocks,  then  the  corresponding  space-invariant  point  spread  function  which  gave  rise  to  this 
imaging  matrix  could  be  easily  ascertained.  This  implies  that  from  the  regularized  pseudoinverse  an  inverse  point 
spread  function  do  (x,  y)  could  be  constructed  which  could  be  used  to  process  the  image  i(x,y)  and  form  an  estimate 
Op  (x,  y)  of  the  original  object.  This  two-dimensional  linear  convolution  technique  is  summarized  by  the  equation 


8 


The  technique  was  tested  on  a  digital  computer  with  encouraging  results.  Further  experimental  investigations  indicate 
that  results  obtained  with  this  Regularized  Pseudoinverse  Deconvolution  (RAPID)  algorithm  are  comparable  in 
quality  to  those  obtained  using  parametric  Wiener  filtering.  An  error  analysis  is  given  in  Appendix  A.  Results  using 
degraded  images  processed  with  both  the  RAPID  algorithm  and  parametric  Wiener  filtering  are  presented  in  section  8. 

7.  DECONVOLUTION  VIA  WIENER  FILTERING 

Wiener  filtering  is  a  well-known  technique  for  processing  images  degraded  and  corrupted  by  noise  as  described 
by  Eq.(34).  From  a  knowledge  of  the  point  spread  function  p(x,y)  characterizing  the  optical  imaging  system,  it  is 
possible  to  compute  the  corresponding  optical  transfer  function  using  the  two-dimensional  Fourier 

transform 


"  p(.x,y)exp[-i2n(xf^  +  yf^)]dxdy. 


From  the  optical  transfer  function  and  knowledge  of  the  power  spectra  of  object  noise  (/^.,  f^) ,  the 

parametric  Wiener  filter  can  be  constructed,  namely 


_ p(4’Vl _ 

P(/./)fc(/./)P  +  7 


When  Y  =  1 ,  Eq.(39)  reduces  simply  to  the  Wiener  filter.  If  y  is  variable  we  refer  to  this  as  the  parametric  Wiener  filter. 
In  the  absence  of  noise,  either  form  of  the  Wiener  filter  reduces  to  the  ideal  inverse  filter.  When  the  power  spectra  are 
not  known,  which  is  often  the  case  in  practice,  Eq.(39)  can  be  approximated  by 

(/,/)= - ^ ^  '  ■  ■■.  (40) 

From  the  Wiener  filter,  using  either  Eqs.(39)  or  (40),  an  inverse  point  spread  function  Wy{x,y)  can  be  constructed 
using  the  two-dimensional  inverse  Fourier  transform.  That  is. 


Wy{x,y)  = 


^Y  ^4’  V  f ^ ^4  ^  ^  ^4‘^4‘ 


With  the  inverse  point  spread  function  given  by  Eq.(41),  an  estimate  dy{x,  y)  of  the  object  can  be  computed  using 
the  two-dimensional  linear  convolutional  equation 


where,  again,  i{x,y)  is  the  degraded  image. 
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8.  RESULTS 


Preliminary  results  obtained  using  the  RAPID  algorithm  are  presented  in  this  section.  For  purposes  of  comparison, 
results  obtained  using  the  parametric  Wiener  filter  algorithm  are  also  included.  The  optical  system  considered  for  these 
studies  was  a  simple  spherical  converging  lens  as  shown  in  Fig.  1. 


Point  Source 


Figure  1 .  A  Simple  Spherical  Converging  Lens  Imaging  System 

The  object  and  image  planes  are  co planar  and  orthogonal  to  the  optical  axis  of  the  lens.  An  arbitrary  point  source 
located  in  the  object  plane  gives  rise  to  an  intensity  distribution  (the  point  spread  function)  in  the  image  plane.  A 
charge-coupled  device  (CCD),  for  example,  can  be  used  to  measure  the  point  spread  function.  A  ray-trace  program  was 
used  in  the  synthesis  of  four  different  point  spread  functions  dominated  by  spherical  aberration  (Fig.  2),  coma  (Fig.  3), 
astigmatism  (Fig.  4),  and  defocus  (Fig.  5).  The  object  plane  distance  (mm),  image  plane  distance  (mm),  focal  length 
(mm),  F-number,  tangential  field-angle  (deg.),  and  sagittal  field-angle  (deg.)  associated  with  each  of  these  four  figures 
are  summarized  in  Table  1.  The  same  CCD  model  was  used  in  all  simulations.  The  array  consisted  of  a  31-by-31  planar 
arrangement  of  square  detectors  measuring  0.01  mm  on  a  side  with  a  center-to-center  spacing  of  0.01  mm.  Each  point 
spread  function  was  obtained  by  tracing  20,000  rays. 
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Table  1 


Spherical 

Aberration 

Coma 

Astigmatism 

Defocus 

Object  plane  distance 

48.6 

48.6 

48.6 

48.6 

Image  plane  distance 

52.0 

52.0 

50.4 

51.0 

Focal  length 

24.3 

24.3 

24.3 

24.3 

F-number 

4.80 

5.70 

5.70 

4.00 

Tangential  field-angle 

0.00 

2.87 

5.91 

0.00 

Sagittal  field-angle 

0.00 

2.87 

0.00 

0.00 

On  the  top  line,  center  diagram,  of  Figs.  2  through  5  are  mesh  plots  of  the  four  point  spread  functions  considered. 
Each  point  spread  function  p(x,y)  is  represented  by  a  31-by-31  matrix.  The  ideal  extended  object  o(x,y)  used  in  this 
analysis  was  a  256-by-256  matrix  which  is  displayed  as  an  8-bit  gray-level  diagram  in  the  upper-left  hand  comer  of 
each  of  these  figures.  Performing  a  two-dimensional  convolution,  see  Eq.  (35),  of  the  extended  object  with  the  point 
spread  function  (in  the  absence  of  noise)  yields  a  286-by-286  degraded  image  i(x,y)  of  which  the  central  256-by-256 
portion  of  the  degraded  image  is  shown  in  the  upper-right  hand  comer  of  each  of  these  figures.  The  full  286-by-286 
matrix  is  used,  however,  in  subsequent  image  processing  computations. 

On  line  two  of  each  of  the  four  figures  are  mesh  plots  of  the  inverse  point  spread  functions  (31-by-31  matrices) 
dn(x,y)  (left-diagram)  and  w  {x,y)  (right-diagram)  obtained  using  the  RAPID  and  Wiener  filter  algorithms, 
respectively.  Performing  a  two-dimensional  convolution  of  the  point  spread  function  p(x,y)  with  each  of  the  inverse 
point  spread  functions  d^ix,y)  and  w^(x,y)  yields  processed  point  spread  functions  (61-by-61  matrices).  The 
central  31-by-31  portions  o^f  these  processed  point  spread  functions  are  shown  as  mesh  plots  on  line  three  (left-  and 
right-diagrams,  respectively).  For  purposes  of  comparison,  a  31-by-31  null-array  with  a  single  non-zero  entry  for  the 
center  pixel  (Delta)  is  also  displayed  on  line  three,  center  diagram. 

The  values  of  the  regularization  parameters  (p  and  y)  which  gave  rise  to  the  best  processed  point  s|)read  functions, 
judged  visually,  for  this  analysis  are:  spherical  aberration  (OandO),  coma  (3x10  and  1x10  ),  astigmatism 

(2xl0~^and2xl0~^),  and  defocus  (5xl0~^and  1x10”^).  These  same  regularization  parameter  values  were  used  in 
cornputing  the  object  estimates  dp(x,  y)  and  dy{x,y)  using  Eqs.  (37)  and  (42),  respectively.  These  object  estimates 
(316-by-316  processed  images)  are  displayed  as  gray-level  diagrams  on  the  bottom-line  of  each  of  the  four  figures. 
Only  the  central  256-by-256  portions  of  the  processed  images  are  shown. 

The  results  presented  in  Figs.  2  through  5  were  based  on  studies  using  synthesized  point  spread  functions.  We 
were  fortunate  to  obtain  real  digitized  degraded  images  of  the  planet  Saturn  taken  with  the  wide-field  planetary 
camera  of  the  Hubble  Space  Telescope.  The  upper  diagram  in  Fig.  6  shows  a  400-by-250  degraded  (unprocessed)  image 
of  the  planet  Saturn.  The  second  line  in  Fig.  6  shows  two  31-by-31  degraded  images  of  different  stars  (called  star  #1  and 
star  #2)  also  taken  with  the  wide-held  planetary  camera.  The  RAPID  algorithm  was  used  to  process  the  degraded 
image  of  Saturn  using  the  two  star  images  as  point  spread  functions  characterizing  the  degradation  process.  In 
particular,  star  #1  image  was  used  as  the  input  point  spread  function.  Both  star  #1  and  star  #2  images  were  first 
processed  using  the  inverse  point  spread  function  obtained  using  the  star  #1  image  only.  The  regularization  parameter, 
P,  selected  was  the  one  which  gave  rise  to  processed  star  images  of  equal  quality,  based  on  a  minimum  entropy 
criterion.  This  same  inverse  point  spread  function  was  then  used,  via  Eq.(35),  to  process  the  degraded  Saturn  image. 
The  size  of  the  reconstructed  image  was  430-by-280.  The  central  400-by-250  portion  of  this  reconstruction  is  shown  in 
the  lower  diagram  of  Fig.  6. 
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Figure  6.  Hubble  Space  Telescope  Imagery 


9.  CONCLUSIONS 


A  new  algorithm,  the  Regularized  Pseudoinverse  Deconvolution  (RAPID)  algorithm,  has  been  developed  and  has 
been  shown  to  be  equivalent  in  performance  to  Wiener  filtering  for  shift-invariant  point  spread  functions.  The 
algorithm  can  be  implemented  either  by  direct  convolution  or  indirectly  using  Fast  Fourier  Transform  (FFT) 
techniques.  The  advantage  of  this  approach  is  that,  through  the  use  of  linear  congruential  scanning  and  the  application 
of  the  circulant  expansion  of  equation  (31),  regularized  reconstruction  methods  can  be  applied  to  extended  images 
degraded  by  shift-variant  point  spread  functions. 
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APPENDIX  A  -  TRADEOFFS  BETWEEN  ACCURACY  AND  SPEED 


In  this  appendix  we  examine  the  RAPID  reconstruction  scheme  in  the  continuous  domain.  An  expression  for  the 
error  displays  a  compromise  between  accuracy  and  speed.  Let  A  denote  the  convolution  operator  on  L  (51  ) : 


Afix)  =  (a-kf)  (X)  =  I  a{x-y)f(y)dm{y) . 


Let  ^  denote  the  two-dimensional  Fourier  transform  and  let  M.  denote  the  multiplication  operator:  M.cp  =  dif.  As 

is  well-known  (see  reference  A1  below),  the  Fourier  transform  diagonalizes  the  convolution  operator: 

^  =  M.  ,with  II  All  =  II  all  .  In  this  formalism,  the  regularization  operator  satisfies: 

d  °° 

A—I  +a  m 

(|fl| 

The  RAPID  operator  is  obtained  by  determining  the  regularization  pseudoinverse  and  then  restricting  it  to  a  region 
near  the  origin.  Introduce  the  associated  convolution  operator 


where  denotes  the  boxcar  window:  14;^  (x)  =  1  if  x  e  [-h,h]  x  [-b, !?],  zero  otherwise.  The  regularization 
operator  for  is 

A  —1  .4-  A  j.  , 

Ai  n  —  r\  - 

^  Wj^A(|fl|  -hP)  a 

Then  the  error  between  the  regularization  operator  and  the  RAPID  operator  is 


a  ,  a 

— - - w.  ★ — r - 

i-|2  „  ^  ih2  o 


As  b  tends  to  infinity,  the  error  tends  to  zero.  However,  a  large  value  for  b  corresponds  to  an  increased  computational 
load.  Thus,  a  trade  off  must  be  made  between  speed  and  accuracy.  In  addition,  the  error  bound  also  indicates  that  a 
different  window  (such  as  a  square  Hamnaing)  might  improve  the  accuracy  of  the  regularization  scheme. 
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